Release v0.16.0.dev0 · keras-team/keras-hub

Summary

📢 KerasNLP and KerasCV are now becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
A new preprocessor flow has been added for vision and audio models.
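
For code that has to run in environments where the consolidated package may not be installed yet, one possible approach is to probe for the new module name first and fall back to the older ones. The sketch below uses only the Python standard library. The helper names `first_installed` and `load_library` are hypothetical and are not part of KerasHub. It also assumes the import names are `keras_hub`, `keras_nlp` and `keras_cv`.

```python
import importlib
import importlib.util

# Candidate module names, newest first. `keras_hub` is assumed to be the
# import name of the consolidated package; the other two are the older names.
CANDIDATES = ("keras_hub", "keras_nlp", "keras_cv")


def first_installed(names=CANDIDATES, is_installed=None):
    """Return the first importable name in `names`, or None if none is found.

    `is_installed` is a hypothetical hook (a callable taking a module name)
    that makes the lookup easy to test without installing anything.
    """
    if is_installed is None:
        def is_installed(name):
            return importlib.util.find_spec(name) is not None
    for name in names:
        if is_installed(name):
            return name
    return None


def load_library():
    """Import and return whichever candidate package is available."""
    name = first_installed()
    if name is None:
        raise ImportError(
            "None of these packages is installed: " + ", ".join(CANDIDATES)
        )
    return importlib.import_module(name)
```

Calling `load_library()` returns the imported module object, so downstream code can branch on its `__name__` when the older and newer packages expose different symbols.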

What's Changed

Update python version in readme to 3.8 by @haifeng-jin in #618
Modify our pip install line so we upgrade tf by @mattdangerw in #616
Use Adam optimizer for quick start by @mattdangerw in #620
Clean up class name and self in calls to super() by @mbrukman in #628
Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
Add DeBERTaV3 Conversion Script by @abheesht17 in #633
Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
Create Backbone base class by @jbischof in #621
Add TPU testing by @chenmoneygithub in #591
Add Base Preprocessor Class by @abheesht17 in #638
Add keras_nlp.samplers by @chenmoneygithub in #563
Add ALBERT Backbone by @abheesht17 in #622
Add a small script to count parameters in our presets by @mattdangerw in #610
Clean up examples/ directory by @ADITYADAS1999 in #637
Fix Small BERT Typo by @abheesht17 in #651
Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
Add FNet Preprocessor by @abheesht17 in #646
Add FNet Backbone by @abheesht17 in #643
Small DeBERTa Docstring Fixes by @abheesht17 in #666
Add Fenced Docstring Testing by @abheesht17 in #640
Corrected the epsilon value by @soma2000-lang in #665
Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
Fix value_dim in TransformerDecoder's cross-attn layer by @abheesht17 in #667
Add ALBERT Presets by @abheesht17 in #655
Add Base Task Class by @abheesht17 in #671
Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
Add FNet Presets by @abheesht17 in #659
Bump the year to 2023 by @mattdangerw in #679
Add BART Backbone by @abheesht17 in #661
Handle trainable and name in the backbone base class by @mattdangerw in #680
Ignore Task Docstring for Testing by @abheesht17 in #683
Light-weight benchmarking script by @NusretOzates in #664
Conditionally import tf_text everywhere by @mattdangerw in #684
Expose token_embedding as a Backbone Property by @abheesht17 in #676
Move from_preset to base tokenizer classes by @shivance in #673
add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
import rouge_scorer directly from rouge_score package by @sampathweb in #691
Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
Temporary fix to get nightly green again by @mattdangerw in #696
GPT2 Text Generation APIs by @chenmoneygithub in #592
Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
Add AlbertClassifier by @shivance in #668
Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
Update to latest black by @mattdangerw in #708
RobertaMaskedLM task and preprocessor by @mattdangerw in #653
Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
Add start/end token padding to GPT2Preprocessor by @chenmoneygithub in #704
Don't install tf stable when building our nightly image by @mattdangerw in #711
Add OPT Backbone and Tokenizer by @mattdangerw in #699
Small OPT Doc-string Edits by @abheesht17 in #716
Default compilation other classifiers by @Plutone11011 in #714
Add BartTokenizer and BART Presets by @abheesht17 in #685
Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
Opt presets by @mattdangerw in #707
fix import of tensorflow_text in tf_utils by @sampathweb in #723
Check for masked token in roberta tokenizer by @mattdangerw in #742
Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
Fix the sampler truncation strategy by @chenmoneygithub in #713
Add ALBERT Conversion Script by @abheesht17 in #736
Add FNet Conversion Script by @abheesht17 in #737
Add BART Conversion Script by @abheesht17 in #739
Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
Improving the layer Description. by @Neeshamraghav012 in #734
Adding ragged support to SinePositionEncoding by @apupneja in #751
Fix trailing space by @mattdangerw in #755
Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
Add missing exports to models API by @mattdangerw in #763
Autogenerate preset table by @Cyber-Machine in #690
Version bump to 0.5.0 by @mattdangerw in #767
Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
Add cache support to decoding journey by @chenmoneygithub in #745
Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
Update README for 2.4.1 release by @mattdangerw in #757
Fix typo in test docstring by @jbischof in #791
Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
Patch 1 - doc-string spell fix by @atharvapurdue in #781
Don't rely on core keras initializer config details by @mattdangerw in #802
Simplify the cache decoding graph by @mattdangerw in #780
Fix Fenced Doc-String #782 by @atharvapurdue in #785
Solve #721 Deberta masklm model by @Plutone11011 in #732
Add from_config to sampler by @mattdangerw in #803
BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
Stop generation once end_token_id is seen by @chenmoneygithub in #769
Added model card links for all pretrained models. by @Cyber-Machine in #795
Initial PR demonstrating public API export logic. by @fchollet in #747
Add preset for finetuning GPT2 on CNN news by @chenmoneygithub in #807
Add API exports for metrics documented on keras.io by @shivance in #816
Add API exports for samplers documented on keras.io by @shivance in #815
Add API exports for models documented on keras.io by @shivance in #814
Add API exports for tokenizers documented on keras.io by @shivance in #817
Add API exports for layers documented on keras.io by @fchollet in #811
Add keras_nlp.utils public API exports. by @fchollet in #819
retrained bert_tiny_uncased_en_sst2_training.ipynb by @susnato in #771
Temporary solution to avoid recompilation by @chenmoneygithub in #808
Call super.config() in BartBackbone's get_config() by @shivance in #818
Update typo in README.md by @ADITYADAS1999 in #821
Add Whisper Backbone by @abheesht17 in #801
Added note for tensorflow-text in the CONTRIBUTING guide by @jaygala223 in #805
Roadmap update by @jaygala223 in #800
Remove API export decorator from base classes by @shivance in #824
Move integration tests out of repo sources. by @fchollet in #826
Function merge_padding_and_attention_mask does not return an output with the desired shape when both padding and attention masks are given by @abodinier in #790
Adding XXBackboneTPUTests by @shivance in #839
Add a t5 tokenizer by @mattdangerw in #852
Add compilation defaults for the BertMaskedLM task model by @ADITYADAS1999 in #836
added init file for t5 by @Akorex in #853
Modified Docstring for GPT2CasualLM by @TheAthleticCoder in #855
Rework bert docstrings for progressive disclosure of complexity by @mattdangerw in #843
Fix "causal" spelling in export decorator by @abheesht17 in #861
Default compilation for Albert, Distilbert, Roberta MaskedLM by @shivance in #833
Speed up default BERT testing roughly 3x by @mattdangerw in #859
Add compilation defaults for the Fnet MaskedLM task model by @soma2000-lang in #834
Default compilation for Debertav3MaskedLM model by @Cyber-Machine in #835
Remove from_preset from fnet tokenizer by @mattdangerw in #865
Add T5 backbone by @fchollet in #828
Speeding the tests for opt by @susnato in #886
Move generate compilation to the task model by @mattdangerw in #804
Speeding the tests for xlm_roberta by @susnato in #885
Rework DistilBERT docstrings for progressive disclosure of complexity. by @Cyber-Machine in #881
Speeding the tests for T5 by @susnato in #888
Rework OPT docstrings for progressive disclosure of complexity. by @Warlord-K in #893
Get our fenced docstring tests working again by @mattdangerw in #895
Speed up default RoBERTa testing roughly 3x by @shivance in #897
Speeding the tests for whisper by @susnato in #887
Update BytePairTokenizerCache to have similar dtypes for x and y in self.factors. by @Sruinard in #871
Init _backbone, _tokenizer and _preprocessor in Task by @jbischof in #899
Rework Whisper docstrings for progressive disclosure of complexity by @susnato in #903
Speed up default DeBERTa_v3 testing roughly 3x by @TheAthleticCoder in #905
Rework docstring of XLMRoberta by @abuelnasr0 in #882
Stripping the MASK token by @TheAthleticCoder in #876
Possible fix for task.summary() by @mattdangerw in #901
Speed up default FNet testing speedups. by @Cyber-Machine in #894
Added TPU test for DebertaV3Backbone by @TheAthleticCoder in #924
Fix failing TPU tests by @chenmoneygithub in #931
Add model contribution guide by @abheesht17 in #820
Resolved roberta_checkpoint by @TheAthleticCoder in #874
GLUE evaluation automation script by @susnato in #848
Ensure shape in sample so that the shape is correct after TFLite conversion by @chenmoneygithub in #902
Returning all Beams and Probs and adding a Testing Unit by @TheAthleticCoder in #908
Roberta docstring reworking by @abuelnasr0 in #910
Speeding the tests for Albert by @soma2000-lang in #873
Mlm mask generator docstring adding example by @abuelnasr0 in #916
Don't save traces for saved model by @mattdangerw in #945
Bump stable tf version to 2.12 by @mattdangerw in #944
Speeding the tests for DistilBert by @soma2000-lang in #872
Allow BPE to treat special tokens as one token by @chenmoneygithub in #939
Edit examples in samplers by @abuelnasr0 in #957
Add RandomSampler to Samplers by @abuelnasr0 in #952
Add BartPreprocessor by @abheesht17 in #856
Remove the old sampler utilities by @mattdangerw in #948
Use direct imports everywhere in library by @mattdangerw in #961
Update docstrings for relocated sampler arg by @jbischof in #964
Fix gpt2, t5 and fnet under mixed precision by @mattdangerw in #958
Small fixes for special_tokens arg in BPE by @abheesht17 in #969
Add contrastive sampler by @chenmoneygithub in #896
Mark num_classes as required in Classifier classes by @chenmoneygithub in #971
Rework model docstrings for progressive disclosure of complexity for f_net by @ADITYADAS1999 in #879
Handle OOV token in XLMRoBERTaTokenizer's token_to_id function by @abheesht17 in #968
Clean up the docker and lint setup by @haifeng-jin in #981
Update generate() to work like fit() and predict() by @mattdangerw in #932
Speed top-p sampler up by only sampling from top-k tokens by @chenmoneygithub in #980
Expose the generate_step compilable function by @mattdangerw in #982
Fix decoder inputs in BART preprocessor by @abheesht17 in #984
Convert string tensors to python strings in generate() by @mattdangerw in #983
Adding a temperature argument to the base sampler class and related tests by @TheAthleticCoder in #951
Track the task preprocessor layer as part of model by @mattdangerw in #985
Add an XLMRobertaMaskedLM task model by @shivance in #950
Add an activation argument to all classifiers by @mattdangerw in #991
Remove activation from README quickstart by @mattdangerw in #992
Rework albert docstrings by @mattdangerw in #993
Rework bart docstrings by @mattdangerw in #994
Rework deberta docstrings by @mattdangerw in #995
Misc fixes to docstrings by @mattdangerw in #996
Added temperature argument to the Contrastive Sampler by @TheAthleticCoder in #997
Add OPTCausalLM and preprocessors by @chenmoneygithub in #990
Version bump to 0.5.0.dev0 by @chenmoneygithub in #1002
Add a flag to restrict which docstring tests run by @mattdangerw in #999
fix docstring for 0.5 release by @chenmoneygithub in #1005
Serialize activation fn properly by @mattdangerw in #1007
Try adding an error if activation and loss are mismatched by @mattdangerw in #1008
Fix docstring for 0.5 release by @chenmoneygithub in #1009
Switch to using pip_build for release by @mattdangerw in #1011
Make version number SSoT. by @fchollet in #827
Add DTensor layout map class method for OPT by @mattdangerw in #1000
Add DTensor layout map class method for GPT-2 by @mattdangerw in #1014
Standalone functions for generate pre/post processing for GPT-2 by @mattdangerw in #998
install namex in the publish workflow by @chenmoneygithub in #1020
Update publish-to-pypi.yml by @chenmoneygithub in #1021
Standalone functions for generate pre/post processing for OPT by @mattdangerw in #1015
Fix typos in export by @chenmoneygithub in #1024
Fix unclosed fenced docstrings by @mattdangerw in #1025
Fix a bug with computing the output mask after generate by @mattdangerw in #1029
small updates to the release doc by @chenmoneygithub in #1031
Sampler docstring edit by @abuelnasr0 in #1033
Fix program crash for id_to_token() method in SentencePieceTokenizer by @abuelnasr0 in #1040
Update our release process to preview docs before release by @mattdangerw in #1043
Add Whisper Tokenizer and Audio Feature Extractor by @abheesht17 in #847
Also strip padding token for opt by @mattdangerw in #1028
Add regex dep by @mattdangerw in #1044
Add BartSeq2SeqLM and conditional text generation with BART by @abheesht17 in #974
Support list/tuple inputs for special tokens in StartEndPacker layer by @abheesht17 in #1045
Support list/tuple inputs for special tokens in MultiSegmentPacker layer by @abheesht17 in #1046
Fix a misleading part of our cached MHA docs by @mattdangerw in #1048
Always pass weight name by kwarg by @mattdangerw in #1053
Always pass metrics in a list or dict by @mattdangerw in #1054
Move Defaults to end of arg docstring and standardise values by @SamuelMarks in #1057
Fix beam search for BART by @abheesht17 in #1058
Replace tf.dtype with "dtype" by @mattdangerw in #1059
Test shapes directly by @mattdangerw in #1064
Clean up metrics tests by @mattdangerw in #1063
Remove metrics merge tests by @mattdangerw in #1065
Fix whisper feature inputs by @mattdangerw in #1069
Always specify shape when creating variables by @mattdangerw in #1067
Remove ragged support from position embeddings by @mattdangerw in #1068
Clean up dtype handling for preprocessing layers by @mattdangerw in #1066
Add BART finetuned on CNN+DM for summarisation by @abheesht17 in #1060
Fix saving bug by @mattdangerw in #1073
Fix t5 forward pass by @mattdangerw in #1082
Feat/make transformer decoder callable without causal mask by @ferraric in #1083
Adding GPTNeoXBackbone by @shivance in #1056
Add a common test case by @mattdangerw in #1095
Update register_keras_serializable to use saving module by @mattdangerw in #1094
Don't test tf format by @mattdangerw in #1104
Add GPTNeoXPreprocessor by @shivance in #1093
Split layers into layers/modeling & layers/preprocessing by @mattdangerw in #1102
Fix merge conflict from #1102 by @mattdangerw in #1105
Add a common base class for generative models by @mattdangerw in #1096
Add GPTNeoXCausalLMPreprocessor by @shivance in #1106
Add Whisper Presets by @abheesht17 in #1089
Refactor RotaryEmbedding and GPTNeoXAttention by @shivance in #1101
Remove all the secret keys for ci by @mattdangerw in #1126
Fix publish to pypi action by @mattdangerw in #1127
Update README for Keras Core by @jbischof in #1135
Ignore errors in UTF-8 decoding by @abheesht17 in #1150
Ports GPTNeoX to KerasCore by @shivance in #1137
Small fix for mixed precision generation on tf by @mattdangerw in #1153
Port DeBERTa to multi-backend by @abheesht17 in #1155
Change all tensors passed to tf.data.Dataset to numpy by @mattdangerw in #1161
Fix broken tests by @mattdangerw in #1163
Pin keras-core to 0.1.0 while investigating failures by @mattdangerw in #1168
Run GPU tests on Jax + Torch by @ianstenbit in #1160
Fix flakes in masked lm testing by removing any indeterminism by @mattdangerw in #1171
Always install the correct version with pip_build by @mattdangerw in #1174
Remove tests for preprocessing inside a functional model by @mattdangerw in #1175
Extend the timeout for large tests by @mattdangerw in #1103
Add GPTNeoXCausalLM by @shivance in #1110
Bump tensorflow to latest stable by @mattdangerw in #1170
Add compute_output_shape to tokenizer by @shivance in #1166
Stop pinning keras-core by @mattdangerw in #1178
Port FNet by @abheesht17 in #1164
Automate the update image flow by @mattdangerw in #1179
Restore mask_position argument name by @mattdangerw in #1185
Port contrastive sampler to multi-backend by @mattdangerw in #1187
Port BeamSampler to core by @shivance in #1181
Port metrics to multi-backend by @mattdangerw in #1186
Generic RotaryEmbedding Layer by @shivance in #1180
Raise ValueError when number of dims evaluate to zero by @sampathweb in #1198
Add XLNetBackbone by @susnato in #1084
Switch from tf.nest to dm-tree by @mattdangerw in #1199
Fix CI for keras-core 0.1.4 by @mattdangerw in #1202
Fix ModuleNotFoundError keras_nlp.models.xlnet by @shivance in #1204
Add support for "untied" embedding weights in language models by @mattdangerw in #1201
Add start_index argument to all position embedding layers by @mattdangerw in #1209
Remove windows line endings by @mattdangerw in #1210
Fix Autograph error with perplexity metric by @shivance in #1211
[JAX backend]: Fix errors with perplexity by @shivance in #1213
Improve layer naming consistency by @mattdangerw in #1219
Stop asserting key order in bart preprocessor by @mattdangerw in #1221
Remove file level docstrings by @mattdangerw in #1222
Fix typos by @mattdangerw in #1220
Typo fix by @mattdangerw in #1223
Fix RotaryEmbedding import by @shivance in #1217
Update transformer_decoder for the proper naming of the sublayers. by @qlzh727 in #1230
Replace tf with numpy by @mattdangerw in #1232
Update to always using ops.shape by @mattdangerw in #1231
Add a test harness based on keras-core's run_layer_test by @mattdangerw in #1238
fixed token_to_id doc + error msg by @jackd in #1240
Changed default TokenAndPositionEmbedding initializer to 'uniform' by @jackd in #1237
Add compat shims for the upcoming keras-core release by @mattdangerw in #1244
Depend on latest keras-core by @mattdangerw in #1246
Removed the undefined self.sequence_length by @sahusiddharth in #1245
Bump devcontainer to 3.9 by @mattdangerw in #1249
Add a mixed precision test and fix mixed precision errors for layers by @mattdangerw in #1242
Quick fix for 0.1.7 keras-core release by @mattdangerw in #1251
Small docstring fixes for the upcoming release by @mattdangerw in #1253
Don't export model internals publicly by @mattdangerw in #1255
Bump master branch version number to 0.7.0.dev0 by @mattdangerw in #1254
Fix/allow different encoder and decoder feature dimensions in transformer decoder layer by @ferraric in #1260
Doc updates to switch branding to Keras 3 by @mattdangerw in #1259
Remove unused TPU testing for backbones by @mattdangerw in #1266
Make gelu a function, not a lambda so it can be loaded without safe_mode=False by @calvingiles in #1262
Update requirements and install instructions for multi-backend keras by @mattdangerw in #1257
Support Keras 3 installation by @mattdangerw in #1258
Remove dtensor by @mattdangerw in #1268
Add a lora dense layer by @mattdangerw in #1263
Factor out testing routines for models by @mattdangerw in #1269
Convert T5 to Keras 3 by @nkovela1 in #1274
Fix missing backticks in DistilBertClassifier docstrings by @Philmod in #1278
T5 checkpoint conversion with HF by @nkovela1 in #1277
Use gelu_approximate directly in t5 presets by @mattdangerw in #1284
Add preset tests and weights URLs by @nkovela1 in #1285
Support loading keras 3 nightly by @mattdangerw in #1286
Remove the use of SentencePieceTrainer from tests by @tirthasheshpatel in #1283
Fix XLM-RoBERTa detokenize() by @abheesht17 in #1289
Correct tie_embedding_weights and add logit checking by @nkovela1 in #1288
Add detokenize testing for model tokenizers by @mattdangerw in #1290
Fix Whisper by @abheesht17 in #1287
Test against Keras 3 by @mattdangerw in #1273
Support TF_USE_LEGACY_KERAS by @mattdangerw in #1295
Run workflows with read-only tokens by @pnacht in #1305
Update CONTRIBUTING.md by @mattdangerw in #1310
Add GitHub Action for Nightly by @sampathweb in #1309
Fix the publish to pypi action by @mattdangerw in #1311
Fix nightly tf failure by @mattdangerw in #1316
Switch deberta to use the "int" dtype by @mattdangerw in #1315
Add security policy by @pnacht in #1319
Fix missing export for reversible embedding by @mattdangerw in #1327
Add version API to keras_nlp by @grasskin in #1324
Fix Keras 3 version check by @sampathweb in #1328
Simplify running KerasNLP with Keras 3 by @mattdangerw in #1308
Fix issues with version by @mattdangerw in #1332
Fix typo in whisper presets files by @mattdangerw in #1337
ELECTRA backbone implementation in keras by @pranavvp16 in #1291
Fix t5 tokenizer expected output by @mattdangerw in #1348
Add init.py for electra by @mattdangerw in #1352
Remove lora dense for now by @mattdangerw in #1359
Adds Kokoro Build script for Keras-NLP GPU tests by @sampathweb in #1355
Fixes GPU Test failures for Keras 3 by @sampathweb in #1361
Change Continuous config to also run only large tests by @sampathweb in #1362
ElectraTokenizer by @pranavvp16 in #1357
Add MistralAI's 7B Transformer as a backbone in KerasNLP Models by @tirthasheshpatel in #1314
changing pooling output by @mbrhd in #1364
Add LlamaBackbone by @shivance in #1203
Align pip_build with keras by @sampathweb in #1374
Remove cloudbuild config by @mattdangerw in #1375
Fix one last bad preset hash by @mattdangerw in #1381
Add a tokenizer for the Mistral backbone by @tirthasheshpatel in #1383
Kaggle Presets by @sampathweb in #1365
Fix mistral and electra tokenizer to match kaggle changes by @mattdangerw in #1387
Align requirements with Keras by @sampathweb in #1386
Add a preprocessor for the Mistral backbone by @tirthasheshpatel in #1385
Switch to always expect full Kaggle preset handles by @mattdangerw in #1390
Fix test for recent keras 3 change by @mattdangerw in #1400
Pass less state to jax generate function by @mattdangerw in #1398
Add llama tokenizer by @mattdangerw in #1401
Add Bloom Model by @abuelnasr0 in #1382
Try fixing tests by @mattdangerw in #1411
Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
Bloom tokenizer by @abuelnasr0 in #1403
Update black formatting by @mattdangerw in #1415
Add Alibi bias layer by @abuelnasr0 in #1404
Pin to tensorflow-hub 0.16.0 to fix CI error by @sampathweb in #1420
Update TF Text and remove TF Hub deps by @sampathweb in #1423
Pin Jax Version in GPU CI by @sampathweb in #1430
Add Bloom preprocessor by @abuelnasr0 in #1424
Add layer attributes for all functional models by @mattdangerw in #1421
Allow setting dtype per model by @mattdangerw in #1431
Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
Fix bart by @mattdangerw in #1434
Add a settable property for sequence_length by @mattdangerw in #1437
Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
Bump the github-actions group with 1 update by @dependabot in #1438
Add 7B presets for Mistral by @tirthasheshpatel in #1436
Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
bump version to 0.8 by @mattdangerw in #1441
Update our sampler documentation to reflect usage by @mattdangerw in #1444
Add Gemma model by @mattdangerw in #1448
Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
Add dtype arg to Gemma HF conversion script by @nkovela1 in #1452
Fix gemma testing import by @mattdangerw in #1462
Add docstring for PyTorch conversion script install instructions by @nkovela1 in #1471
Add an annotation to tests that need kaggle auth by @mattdangerw in #1470
Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
Bump the master version to 0.9 by @mattdangerw in #1473
Pin to TF 2.16 RC0 by @sampathweb in #1478
Fix gemma rms_normalization's use of epsilon by @cpsauer in #1472
Add FalconBackbone by @SamanehSaadat in #1475
CI - Add kaggle creds to pull model by @sampathweb in #1459
bug in example for ReversibleEmbedding by @TheCrazyT in #1484
doc fix for contrastive sampler by @mattdangerw in #1488
Remove broken link to masking and padding guide by @mattdangerw in #1487
Fix a typo in causal_lm_preprocessors by @SamanehSaadat in #1489
Fix dtype accessors of tasks/backbones by @mattdangerw in #1486
Auto-labels 'gemma' on 'gemma' issues/PRs. by @shmishra99 in #1490
Add BloomCausalLM by @abuelnasr0 in #1467
Remove the bert jupyter conversion notebooks by @mattdangerw in #1492
Add FalconTokenizer by @SamanehSaadat in #1485
Add FalconPreprocessor by @SamanehSaadat in #1498
Rename 176B presets & Add other presets into bloom_presets.py by @abuelnasr0 in #1496
Add bloom presets by @abuelnasr0 in #1501
Create workflow for auto assignment of issues and for stale issues by @sachinprasadhs in #1495
Update requirements to TF 2.16 by @sampathweb in #1503
Expose Task and Backbone by @mattdangerw in #1506
Clean up and add our gemma conversion script by @mattdangerw in #1493
Don't auto-update JAX GPU by @sampathweb in #1507
Keep rope at float32 precision by @grasskin in #1497
Bump the python group with 2 updates by @dependabot in #1509
Fixes for the LLaMA backbone + add dropout by @tirthasheshpatel in #1499
Add LlamaPreprocessor and LlamaCausalLMPreprocessor by @tirthasheshpatel in #1511
Always run the rotary embedding layer in float32 by @tirthasheshpatel in #1508
CI: Fix psutil - Remove install of Python 3.9 and alias of python3 by @sampathweb in #1514
Update gemma_backbone.py for sharding config. by @qlzh727 in #1491
Docs/modelling layers by @mykolaskrynnyk in #1502
Standardize docstring by @sachinprasadhs in #1516
Support tokenization of special tokens for word_piece_tokenizer by @abuelnasr0 in #1397
Upload Model to Kaggle by @SamanehSaadat in #1512
Add scoring mode to MistralCausalLM by @RyanMullins in #1521
Add Mistral Instruct V0.2 preset by @tirthasheshpatel in #1520
Add Tests for Kaggle Upload Validation by @SamanehSaadat in #1524
Add presets for Electra and checkpoint conversion script by @pranavvp16 in #1384
Allow saving / loading from Huggingface Hub preset by @Wauplin in #1510
Stop on multiple end tokens by @grasskin in #1518
Fix doc: mistral_base_en -> mistral_7b_en by @asmith26 in #1528
Add lora example to GemmaCausalLM docstring by @SamanehSaadat in #1527
Add LLaMA Causal LM with 7B presets by @tirthasheshpatel in #1526
Add task base classes; support out of tree library extensions by @mattdangerw in #1517
Doc fixes by @mattdangerw in #1530
Run the LLaMA and Mistral RMS Layer Norm in float32 by @tirthasheshpatel in #1532
Adds score API to GPT-2 by @RyanMullins in #1533
increase pip timeout to 1000s to avoid connection resets by @sampathweb in #1535
Adds the score API to LlamaCausalLM by @RyanMullins in #1534
Implement compute_output_spec() for tokenizers with vocabulary. by @briango28 in #1523
Remove straggler type annotations by @mattdangerw in #1536
Always run SiLU activation in float32 for LLaMA and Mistral by @tirthasheshpatel in #1540
Bump the python group with 2 updates by @dependabot in #1538
Disallow saving to preset from keras 2 by @SamanehSaadat in #1545
Fix the rotary embedding computation in LLaMA by @tirthasheshpatel in #1544
Fix re-compilation bugs by @mattdangerw in #1541
Fix preprocessor from_preset bug by @mattdangerw in #1549
Fix a strange issue with preprocessing layer output types by @mattdangerw in #1550
Fix lowercase bug in wordpiece tokenizer by @abuelnasr0 in #1543
Small docs updates by @mattdangerw in #1553
Add a few new preset for gemma by @mattdangerw in #1556
Fix the new stop_token_ids argument by @mattdangerw in #1558
Fix tests with the "auto" default for stop token ids by @mattdangerw in #1559
Fix print_fn issue in task test by @SamanehSaadat in #1563
Update presets for code gemma by @mattdangerw in #1564
Fix saving bug for untied weights with keras 3.2 by @mattdangerw in #1568
0.9 is out, nightly should be a preview of 0.10 now by @mattdangerw in #1570
Do the reverse embedding in the same dtype as the input embedding by @mattdangerw in #1548
Add support for positions array in keras_nlp.layers.RotaryEmbedding layer by @tirthasheshpatel in #1571
Support Task Saving/Loading by @SamanehSaadat in #1547
Improve error handling for non-keras model loading attempts by @SamanehSaadat in #1577
Add Model Card for Hugging Face Upload by @SamanehSaadat in #1578
Add Saving Tests by @SamanehSaadat in #1590
Improve error handling for missing TensorFlow dependency in keras_nlp. by @SamanehSaadat in #1585
Fix Keras import by @sampathweb in #1593
Check kagglehub version before upload by @SamanehSaadat in #1594
Change the order of importing keras by @james77777778 in #1596
Add backend info to HF model card by @SamanehSaadat in #1599
Bump required kagglehub version to 0.2.4 by @SamanehSaadat in #1600
Bump bert_tiny_en_uncased_sst2 classifier version by @SamanehSaadat in #1602
Allow a task preprocessor to be an argument in from_preset by @SamanehSaadat in #1603
API Generation by @sampathweb in #1608
Update readme with some recent changes by @mattdangerw in #1575
Bump the python group with 2 updates by @dependabot in #1611
Add CodeGemma 1.1 presets by @grasskin in #1617
Fix rope scaling factor by @abuelnasr0 in #1605
Fix the issue of propagating training argument in subclasses by @james77777778 in #1623
Pass kwargs to tokenizer when creating preprocessor by @SamanehSaadat in #1632
Add phi3 by @abuelnasr0 in #1597
Add LLaMA 3 tokenizer and preset by @tirthasheshpatel in #1584
Export missing llama 3 symbol by @mattdangerw in #1633
PaliGemma by @mattdangerw in #1636
Update pali_gemma_presets.py by @divyashreepathihalli in #1637
Update version to 0.13.0 for the master branch by @mattdangerw in #1640
Update llama3 preset versions by @mattdangerw in #1641
extra argument in save_to_preset method by @sineeli in #1634
Fix a typo in an error handling message by @SamanehSaadat in #1647
Fix a typo in phi3 metadata by @mattdangerw in #1646
Add FalconCausalLM by @SamanehSaadat in #1635
Add include rescaling to the pali gemma backbone by @mattdangerw in #1650
PaliGemma docstring fix by @mattdangerw in #1651
Fix newline characters for pali_gemma by @mattdangerw in #1655
Remove dead code by @mattdangerw in #1659
Fix some testing on the latest version of keras by @mattdangerw in #1663
Vicuna Models checkpoints transfer script by @sineeli in #1657
Add documented but missing methods for some tokenizers by @SamanehSaadat in #1664
Changed from_preset file downloading to use GFile when able by @VarunS1997 in #1665
Fix gfile downloads by @mattdangerw in #1666
More error handling for gfile by @mattdangerw in #1667
Update error message by @mattdangerw in #1668
Ditch Keras 2 support by @mattdangerw in #1658
fix GemmaBackbone.get_layout_map + test by @martin-gorner in #1669
Convert a safetensor checkpoint from Hugging Face hub by @ariG23498 in #1662
Add Gemma 2 model by @grasskin in #1673
Version bump to 0.14.0.dev0 by @grasskin in #1675
Revert "Version bump to 0.14.0.dev0" by @grasskin in #1676
Remove Keras pin, fix tests by @mattdangerw in #1681
Add quantization support for Gemma, Gemma2 and PaliGemma by @james77777778 in #1670
add vicuna preset by @sineeli in #1672
Porting Gemma 2 transformers checkpoint by @ariG23498 in #1678
Improve CI speed and resolve issues of run_quantization_check by @james77777778 in #1682
Remove build_from_signature from MHA layers by @mattdangerw in #1687
Refactoring: in CachedMultiHeadAttention call MHA methods instead of recoding the attention calculation by @apehex in #1684
Porting PaliGemma transformers checkpoint by @ariG23498 in #1686
Allow importing keras_nlp without tensorflow by @mattdangerw in #1660
Add flag to gemma conversion script to specify local orbax by @mattdangerw in #1688
Fix compatibility for earlier versions of Keras by @james77777778 in #1690
Add a test against keras-nightly by @mattdangerw in #1693
Fix dtype bugs in ReversibleEmbedding and LayerNorm by @james77777778 in #1692
Partially revert #1687 by @mattdangerw in #1695
Fix quantization test for XLNet by @james77777778 in #1699
Add a HF BERT converter, improve safetensor loading by @mattdangerw in #1694
Add a subtle fix for gemma 2 conversions by @mattdangerw in #1701
One more small Gemma conversion fix by @mattdangerw in #1702
Slightly more defensive handling of type for backbone by @mattdangerw in #1703
Add support for converting Gemma 2 checkpoints by @mattdangerw in #1700
Make it clearer what is running in the github action UI by @mattdangerw in #1707
Try upgrading tensorflow pin by @mattdangerw in #1706
Bump version to fix query norm in Gemma 2 9b by @mattdangerw in #1709
Gemma: Add logit soft-capping to score function. by @RyanMullins in #1712
Version bump HEAD to 0.15 by @mattdangerw in #1713
Port gpt2 transformers checkpoint by @cosmo3769 in #1704
Add soft capping to reversible embedding layer by @mattdangerw in #1718
Add presets for gemma 2 2b by @mattdangerw in #1721
Utilize to_numpy=True in quantize if available by @james77777778 in #1725
Dynamic int8 quantization for Llama2 and Llama3 by @james77777778 in #1720
Bump the python group with 2 updates by @dependabot in #1726
Shield gemma shortnames by @mattdangerw in #1731
Sliding window fixes by @mattdangerw in #1738
Add int8 models to Llama2 and Llama3 by @james77777778 in #1734
Port distilbert transformer checkpoint by @cosmo3769 in #1736
Add support of kwargs to Backbone.from_preset and fix the dtype forwarding in Task.from_preset by @james77777778 in #1742
Remove src init file contents by @mattdangerw in #1743
Remove ROADMAP.md by @mattdangerw in #1773
Fix nested list in args on keras.io by @mattdangerw in #1772
Remove stale tf only examples by @mattdangerw in #1771
Limit the default sequence length to 1024 for all models by @mattdangerw in #1770
Consistent preprocessing output on all backends by @mattdangerw in #1777
Port albert transformer checkpoint by @cosmo3769 in #1767
Lower the default learning rate for albert by @mattdangerw in #1786
Port bart transformer checkpoint by @cosmo3769 in #1783
Add an option to disable default compilation by @mattdangerw in #1787
Port mistral transformer checkpoint by @cosmo3769 in #1768
[Bart] Fix missing weight port by @cosmo3769 in #1789
Remove python 3.8 version in setup.py by @mattdangerw in #1792
Class detection works for huggingface checkpoints by @mattdangerw in #1800
Rename KerasNLP symbols for a multi-modal future by @mattdangerw in #1803
Move preprocessing to base classes by @mattdangerw in #1807
Add add_bos=False, add_eos=False to SentencePieceTokenizer.__init__() by @briango28 in #1811
Only load a full task config when load_task_extras is passed by @mattdangerw in #1812
Add image and audio converter classes by @mattdangerw in #1813
Simplify registering "built-in" presets by @mattdangerw in #1818
Support image and audio information in task summaries by @mattdangerw in #1819
Take two of #1812, simpler classifier head loading by @mattdangerw in #1823
Remove preprocessing layers we no longer use by @mattdangerw in #1824
Add missing aliases by @mattdangerw in #1828
Bump nightly and head package version to 0.16 by @mattdangerw in #1826
Fix post_attention_norm name in Gemma by @SamanehSaadat in #1834
Update README.md by @mattdangerw in #1837
Fix saved classifier models from before 0.14 by @mattdangerw in #1839
Fix device scope issues by @mattdangerw in #1841
Preprocessing decorator fixes by @mattdangerw in #1843
Keras hub rename by @mattdangerw in #1840
Update README.md by @mattdangerw in #1846
Add imagenet prediction decoder by @mattdangerw in #1848
Update GitHub links post rename by @mattdangerw in #1851
Add anchor_generator, box_matcher and non_max_supression by @sineeli in #1849
Keras nlp shim by @mattdangerw in #1853
Only publish KerasNLP if we have a release tag by @mattdangerw in #1854
Make keras-nlp-nightly shim depend on keras-hub-nightly shim by @mattdangerw in #1856
Version bump by @mattdangerw in #1857
Fix api_export.py by @divyashreepathihalli in #1858
Expunge include_rescaling from backbones by @mattdangerw in #1859
add SAM model by @divyashreepathihalli in #1847
Add StableDiffusion3 by @james77777778 in #1820
Weights densenet by @sachinprasadhs in #1855
Add StableDiffusion3 preset by @james77777778 in #1884
Remove copyright notices by @fchollet in #1882
update preprocessor docstring by @divyashreepathihalli in #1881
Update kokoro for new name by @mattdangerw in #1852
update README by @divyashreepathihalli in #1886
update jax version by @divyashreepathihalli in #1888
Remove remaining copyright mentions by @sachinprasadhs in #1889
[RetinaNet] Add FPN, RetinaNet label encoder as part of phase 1 by @sineeli in #1885
Add SAM weights conversion and preprocessor flow by @divyashreepathihalli in #1891
added "tie_word_embeddings" setting necessary for Llama 3.2 by @martin-gorner in #1895
MixTransformer Argument Clarification by @DavidLandup0 in #1894
Update README by @divyashreepathihalli in #1890
Allow backbone to be any functional, preprocessor any callable by @mattdangerw in #1900
Update the conversion script to match preprocessor names and register presets by @divyashreepathihalli in #1902
Image classifier changes by @mattdangerw in #1901
Bump the python group with 2 updates by @dependabot in #1898
Add VAEBackbone and use it for SD3 by @james77777778 in #1892
Add Deeplabv3Plus and DeepLabV3 with segmentation by @sachinprasadhs in #1869
Support Gemma2 checkpoint conversion by @jeffcarp in #1905
BytePairTokenizer must not split sequences of \n by @martin-gorner in #1910
fix for generation that never stops in Llama3-Instruct variants by @martin-gorner in #1904
fix failing JAX GPU test by @divyashreepathihalli in #1911
Refactor MMDiT, add ImageToImage and Inpaint for SD3 by @james77777778 in #1909
Minor bug fix to image_converter.image_size by @sachinprasadhs in #1915
[Mix Transformer] Add Presets for MiTB0...MiTB5 by @DavidLandup0 in #1893
remove default resizing for vision backbones by @divyashreepathihalli in #1916
Update VGG model to be compatible with Timm weights and add conversion scripts by @jeffcarp in #1914
Deeplab presets by @sachinprasadhs in #1918
update presets to point to the main Keras Kaggle page by @divyashreepathihalli in #1921
Added test for the way BytePairTokenizer handles the \n\n sequence, which is important in Llama chat templates by @martin-gorner in #1912
Task models fix by @martin-gorner in #1922
adding option strip_prompt to generate() by @martin-gorner in #1913
Layout map for Llama by @martin-gorner in #1923
Update deeplab_v3_presets.py by @divyashreepathihalli in #1924
Add paths to get SAM weights from by @divyashreepathihalli in #1925
Two fixes for image resizing in preprocessing by @mattdangerw in #1927
add back default image resizing by @divyashreepathihalli in #1926
Update deeplab_v3_presets.py by @divyashreepathihalli in #1928
Update PaliGemma to remove include_rescaling arg by @divyashreepathihalli in #1917
fix path by @sachinprasadhs in #1929
Fix paligemma checkpoint conversion script by @divyashreepathihalli in #1931
update preset path to point to latest version of models by @divyashreepathihalli in #1932
Update sdv3 path by @sachinprasadhs in #1934
update sam docstring to show correct backbone in docstring by @divyashreepathihalli in #1936
Convert input dictionary to tensors during train_on_batch by @wenxindongwork in #1919
Register VGG presets. by @sachinprasadhs in #1935
Add ResNetVD presets by @gowthamkpr in #1897
Rename mix_transformer to mit by @divyashreepathihalli in #1937
Fix docstrings and args default values by @divyashreepathihalli in #1938
Update sam_presets.py by @divyashreepathihalli in #1940
Update vit_det_backbone.py by @divyashreepathihalli in #1941
fix gpu test by @divyashreepathihalli in #1939
Added Support for Returning Attention Scores in TransformerEncoder call by @anirudhr20 in #1879
Mark preset tests as large by @divyashreepathihalli in #1942
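
Several entries above mention logit soft-capping for Gemma (#1712, #1718). For readers unfamiliar with the term, here is a minimal sketch of the general technique as it is usually described: logits are squashed smoothly into a bounded range with a scaled tanh. The function name `soft_cap` and the cap value of 30.0 are illustrative assumptions, and this is not KerasHub's implementation.

```python
import math


def soft_cap(logits, cap):
    """Squash each logit smoothly into the open interval (-cap, cap).

    Hypothetical helper for illustration only; `cap` must be positive.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    # tanh is roughly linear near zero, so small logits pass through almost
    # unchanged, while large magnitudes saturate just below `cap`.
    return [cap * math.tanh(x / cap) for x in logits]


# The cap value 30.0 is an assumption chosen for this example.
capped = soft_cap([-100.0, -1.0, 0.0, 1.0, 100.0], cap=30.0)
print(capped)
```

Because tanh is monotonic, the ordering of the logits is preserved; only extreme magnitudes are compressed.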

New Contributors

@haifeng-jin made their first contribution in #618
@mbrukman made their first contribution in #628
@soma2000-lang made their first contribution in #665
@NusretOzates made their first contribution in #664
@shivance made their first contribution in #673
@Plutone11011 made their first contribution in #714
@TheAthleticCoder made their first contribution in #731
@Neeshamraghav012 made their first contribution in #734
@apupneja made their first contribution in #751
@Cyber-Machine made their first contribution in #690
@atharvapurdue made their first contribution in #781
@fchollet made their first contribution in #747
@susnato made their first contribution in #771
@jaygala223 made their first contribution in #805
@abodinier made their first contribution in #790
@Akorex made their first contribution in #853
@Warlord-K made their first contribution in #893
@Sruinard made their first contribution in #871
@abuelnasr0 made their first contribution in #882
@SamuelMarks made their first contribution in #1057
@ferraric made their first contribution in #1083
@ianstenbit made their first contribution in #1160
@qlzh727 made their first contribution in #1230
@jackd made their first contribution in #1240
@sahusiddharth made their first contribution in #1245
@calvingiles made their first contribution in #1262
@tirthasheshpatel made their first contribution in #1283
@pnacht made their first contribution in #1305
@pranavvp16 made their first contribution in #1291
@mbrhd made their first contribution in #1364
@dependabot made their first contribution in #1438
@cpsauer made their first contribution in #1472
@TheCrazyT made their first contribution in #1484
@shmishra99 made their first contribution in #1490
@mykolaskrynnyk made their first contribution in #1502
@RyanMullins made their first contribution in #1521
@Wauplin made their first contribution in #1510
@asmith26 made their first contribution in #1528
@briango28 made their first contribution in #1523
@VarunS1997 made their first contribution in #1665
@martin-gorner made their first contribution in #1669
@ariG23498 made their first contribution in #1662
@apehex made their first contribution in #1684
@cosmo3769 made their first contribution in #1704
@DavidLandup0 made their first contribution in #1894
@jeffcarp made their first contribution in #1905
@wenxindongwork made their first contribution in #1919
@anirudhr20 made their first contribution in #1879

Full Changelog: v0.4.0...v0.17.0.dev0
