v0.16.0.dev0
Pre-release
divyashreepathihalli released this 22 Oct 00:30 · 62 commits to r0.17 since this release
Summary
- 📢 KerasNLP and KerasCV are becoming KerasHub 📢. Both libraries have been consolidated into the KerasHub package.
- Models now available in KerasHub: albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and `self` in calls to `super()` by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create `Backbone` base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix `value_dim` in `TransformerDecoder`'s cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose `token_embedding` as a Backbone Property by @abheesht17 in #676
- Move `from_preset` to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add `AlbertClassifier` by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to `GPT2Preprocessor` by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_token_id is seen by @chenmoneygithub in #769
- Added model card links for all pretrained models. by @Cyber-Machine in #795
- Initial PR demonstrating public API export logic. by @fchollet in #747
- Add preset for finetuning GPT2 on CNN news by @chenmoneygithub in #807
- Add API exports for metrics documented on keras.io by @shivance in #816
- Add API exports for samplers documented on keras.io by @shivance in #815
- Add API exports for models documented on keras.io by @shivance in #814
- Add API exports for tokenizers documented on keras.io by @shivance in #817
- Add API exports for layers documented on keras.io by @fchollet in #811
- Add keras_nlp.utils public API exports. by @fchollet in #819
- retrained bert_tiny_uncased_en_sst2_training.ipynb by @susnato in #771
- Temporary solution to avoid recompilation by @chenmoneygithub in #808
- Call super.config() in BartBackbone's get_config() by @shivance in #818
- Update typo in README.md by @ADITYADAS1999 in #821
- Add Whisper Backbone by @abheesht17 in #801
- Added note for tensorflow-text in the CONTRIBUTING guide by @jaygala223 in #805
- Roadmap update by @jaygala223 in #800
- Remove API export decorator from base classes by @shivance in #824
- Move integration tests out of repo sources. by @fchollet in #826
- Function merge_padding_and_attention_mask does not return an output with the desired shape when both padding and attention masks are given by @abodinier in #790
- Adding XXBackboneTPUTests by @shivance in #839
- Add a t5 tokenizer by @mattdangerw in #852
- Add compilation defaults for the BertMaskedLM task model by @ADITYADAS1999 in #836
- added init file for t5 by @Akorex in #853
- Modified Docstring for GPT2CasualLM by @TheAthleticCoder in #855
- Rework bert docstrings for progressive disclosure of complexity by @mattdangerw in #843
- Fix "causal" spelling in export decorator by @abheesht17 in #861
- Default compilation for Albert, Distilbert, Roberta MaskedLM by @shivance in #833
- Speed up default BERT testing roughly 3x by @mattdangerw in #859
- Add compilation defaults for the Fnet MaskedLM task model by @soma2000-lang in #834
- Default compilation for Debertav3MaskedLM model by @Cyber-Machine in #835
- Remove from_preset from fnet tokenizer by @mattdangerw in #865
- Add T5 backbone by @fchollet in #828
- Speeding the tests for opt by @susnato in #886
- Move generate compilation to the task model by @mattdangerw in #804
- Speeding the tests for xlm_roberta by @susnato in #885
- Rework DistilBERT docstrings for progressive disclosure of complexity. by @Cyber-Machine in #881
- Speeding the tests for T5 by @susnato in #888
- Rework OPT docstrings for progressive disclosure of complexity. by @Warlord-K in #893
- Get our fenced docstring tests working again by @mattdangerw in #895
- Speed up default RoBERTa testing roughly 3x by @shivance in #897
- Speeding the tests for whisper by @susnato in #887
- Update BytePairTokenizerCache to have similar dtypes for x and y in self.factors. by @Sruinard in #871
- Init `_backbone`, `_tokenizer` and `_preprocessor` in Task by @jbischof in #899
- Rework Whisper docstrings for progressive disclosure of complexity by @susnato in #903
- Speed up default DeBERTa_v3 testing roughly 3x by @TheAthleticCoder in #905
- Rework docstring of XLMRoberta by @abuelnasr0 in #882
- Stripping the MASK token by @TheAthleticCoder in #876
- Possible fix for task.summary() by @mattdangerw in #901
- Speed up default FNet testing speedups. by @Cyber-Machine in #894
- Added TPU test for DebertaV3Backbone by @TheAthleticCoder in #924
- Fix failing TPU tests by @chenmoneygithub in #931
- Add model contribution guide by @abheesht17 in #820
- Resolved roberta_checkpoint by @TheAthleticCoder in #874
- GLUE evaluation automation script by @susnato in #848
- Ensure shape in sample so that the shape is correct after TFLite conversion by @chenmoneygithub in #902
- Returning all Beams and Probs and adding a Testing Unit by @TheAthleticCoder in #908
- Roberta docstring reworking by @abuelnasr0 in #910
- Speeding the tests for Albert by @soma2000-lang in #873
- Mlm mask generator docstring adding example by @abuelnasr0 in #916
- Don't save traces for saved model by @mattdangerw in #945
- Bump stable tf version to 2.12 by @mattdangerw in #944
- Speeding the tests for DistilBert by @soma2000-lang in #872
- Allow BPE to treat special tokens as one token by @chenmoneygithub in #939
- Edit examples in samplers by @abuelnasr0 in #957
- Add RandomSampler to Samplers by @abuelnasr0 in #952
- Add BartPreprocessor by @abheesht17 in #856
- Remove the old sampler utilities by @mattdangerw in #948
- Use direct imports everywhere in library by @mattdangerw in #961
- Update docstrings for relocated `sampler` arg by @jbischof in #964
- Fix gpt2, t5 and fnet under mixed precision by @mattdangerw in #958
- Small fixes for special_tokens arg in BPE by @abheesht17 in #969
- Add contrastive sampler by @chenmoneygithub in #896
- Mark num_classes as required in Classifier classes by @chenmoneygithub in #971
- Rework model docstrings for progressive disclosure of complexity for f_net by @ADITYADAS1999 in #879
- Handle OOV token in XLMRoBERTaTokenizer's token_to_id function by @abheesht17 in #968
- Clean up the docker and lint setup by @haifeng-jin in #981
- Update generate() to work like fit() and predict() by @mattdangerw in #932
- Speed top-p sampler up by only sampling from top-k tokens by @chenmoneygithub in #980
- Expose the generate_step compilable function by @mattdangerw in #982
- Fix decoder inputs in BART preprocessor by @abheesht17 in #984
- Convert string tensors to python strings in `generate()` by @mattdangerw in #983
- Adding a temperature argument to the base sampler class and related tests by @TheAthleticCoder in #951
- Track the task preprocessor layer as part of model by @mattdangerw in #985
- Add an XLMRobertaMaskedLM task model by @shivance in #950
- Add an activation argument to all classifiers by @mattdangerw in #991
- Remove activation from README quickstart by @mattdangerw in #992
- Rework albert docstrings by @mattdangerw in #993
- Rework bart docstrings by @mattdangerw in #994
- Rework deberta docstrings by @mattdangerw in #995
- Misc fixes to docstrings by @mattdangerw in #996
- Added temperature argument to the Contrastive Sampler by @TheAthleticCoder in #997
- Add `OPTCausalLM` and preprocessors by @chenmoneygithub in #990
- Version bump to 0.5.0.dev0 by @chenmoneygithub in #1002
- Add a flag to restrict which docstring tests run by @mattdangerw in #999
- fix docstring for 0.5 release by @chenmoneygithub in #1005
- Serialize activation fn properly by @mattdangerw in #1007
- Try adding an error if activation and loss are mismatched by @mattdangerw in #1008
- Fix docstring for 0.5 release by @chenmoneygithub in #1009
- Switch to using pip_build for release by @mattdangerw in #1011
- Make version number SSoT. by @fchollet in #827
- Add DTensor layout map class method for OPT by @mattdangerw in #1000
- Add DTensor layout map class method for GPT-2 by @mattdangerw in #1014
- Standalone functions for generate pre/post processing for GPT-2 by @mattdangerw in #998
- install namex in the publish workflow by @chenmoneygithub in #1020
- Update publish-to-pypi.yml by @chenmoneygithub in #1021
- Standalone functions for generate pre/post processing for OPT by @mattdangerw in #1015
- Fix typos in export by @chenmoneygithub in #1024
- Fix unclosed fenced docstrings by @mattdangerw in #1025
- Fix a bug with computing the output mask after generate by @mattdangerw in #1029
- small updates to the release doc by @chenmoneygithub in #1031
- Sampler docstring edit by @abuelnasr0 in #1033
- Fix program crash for id_to_token() method in SentencePieceTokenizer by @abuelnasr0 in #1040
- Update our release process to preview docs before release by @mattdangerw in #1043
- Add Whisper Tokenizer and Audio Feature Extractor by @abheesht17 in #847
- Also strip padding token for opt by @mattdangerw in #1028
- Add regex dep by @mattdangerw in #1044
- Add BartSeq2SeqLM and conditional text generation with BART by @abheesht17 in #974
- Support list/tuple inputs for special tokens in StartEndPacker layer by @abheesht17 in #1045
- Support list/tuple inputs for special tokens in MultiSegmentPacker layer by @abheesht17 in #1046
- Fix a misleading part of our cached MHA docs by @mattdangerw in #1048
- Always pass weight name by kwarg by @mattdangerw in #1053
- Always pass metrics in a list or dict by @mattdangerw in #1054
- Move `Defaults to` to end of arg docstring and standardise values by @SamuelMarks in #1057
- Fix beam search for BART by @abheesht17 in #1058
- Replace tf.dtype with "dtype" by @mattdangerw in #1059
- Test shapes directly by @mattdangerw in #1064
- Clean up metrics tests by @mattdangerw in #1063
- Remove metrics merge tests by @mattdangerw in #1065
- Fix whisper feature inputs by @mattdangerw in #1069
- Always specify shape when creating variables by @mattdangerw in #1067
- Remove ragged support from position embeddings by @mattdangerw in #1068
- Clean up dtype handling for preprocessing layers by @mattdangerw in #1066
- Add BART finetuned on CNN+DM for summarisation by @abheesht17 in #1060
- Fix saving bug by @mattdangerw in #1073
- Fix t5 forward pass by @mattdangerw in #1082
- Feat/make transformer decoder callable without causal mask by @ferraric in #1083
- Adding `GPTNeoXBackbone` by @shivance in #1056
- Add a common test case by @mattdangerw in #1095
- Update register_keras_serializable to use saving module by @mattdangerw in #1094
- Don't test tf format by @mattdangerw in #1104
- Add `GPTNeoXPreprocessor` by @shivance in #1093
- Split layers into layers/modeling & layers/preprocessing by @mattdangerw in #1102
- Fix merge conflict from #1102 by @mattdangerw in #1105
- Add a common base class for generative models by @mattdangerw in #1096
- Add `GPTNeoXCausalLMPreprocessor` by @shivance in #1106
- Add Whisper Presets by @abheesht17 in #1089
- Refactor `RotaryEmbedding` and `GPTNeoXAttention` by @shivance in #1101
- Remove all the secret keys for ci by @mattdangerw in #1126
- Fix publish to pypi action by @mattdangerw in #1127
- Update README for Keras Core by @jbischof in #1135
- Ignore errors in UTF-8 decoding by @abheesht17 in #1150
- Ports GPTNeoX to KerasCore by @shivance in #1137
- Small fix for mixed precision generation on tf by @mattdangerw in #1153
- Port DeBERTa to multi-backend by @abheesht17 in #1155
- Change all tensors passed to tf.data.Dataset to numpy by @mattdangerw in #1161
- Fix broken tests by @mattdangerw in #1163
- Pin keras-core to 0.1.0 while investigating failures by @mattdangerw in #1168
- Run GPU tests on Jax + Torch by @ianstenbit in #1160
- Fix flakes in masked lm testing by removing any indeterminism by @mattdangerw in #1171
- Always install the correct version with pip_build by @mattdangerw in #1174
- Remove tests for preprocessing inside a functional model by @mattdangerw in #1175
- Extend the timeout for large tests by @mattdangerw in #1103
- Add `GPTNeoXCausalLM` by @shivance in #1110
- Bump tensorflow to latest stable by @mattdangerw in #1170
- Add compute_output_shape to tokenizer by @shivance in #1166
- Stop pinning keras-core by @mattdangerw in #1178
- Port FNet by @abheesht17 in #1164
- Automate the update image flow by @mattdangerw in #1179
- Restore mask_position argument name by @mattdangerw in #1185
- Port contrastive sampler to multi-backend by @mattdangerw in #1187
- Port `BeamSampler` to core by @shivance in #1181
- Port metrics to multi-backend by @mattdangerw in #1186
- Generic `RotaryEmbedding` Layer by @shivance in #1180
- Raise ValueError when number of dims evaluate to zero by @sampathweb in #1198
- Add XLNetBackbone by @susnato in #1084
- Switch from tf.nest to dm-tree by @mattdangerw in #1199
- Fix CI for keras-core 0.1.4 by @mattdangerw in #1202
- Fix ModuleNotFoundError `keras_nlp.models.xlnet` by @shivance in #1204
- Add support for "untied" embedding weights in language models by @mattdangerw in #1201
- Add start_index argument to all position embedding layers by @mattdangerw in #1209
- Remove windows line endings by @mattdangerw in #1210
- Fix Autograph error with perplexity metric by @shivance in #1211
- [JAX backend]: Fix errors with perplexity by @shivance in #1213
- Improve layer naming consistency by @mattdangerw in #1219
- Stop asserting key order in bart preprocessor by @mattdangerw in #1221
- Remove file level docstrings by @mattdangerw in #1222
- Fix typos by @mattdangerw in #1220
- Typo fix by @mattdangerw in #1223
- Fix RotaryEmbedding import by @shivance in #1217
- Update transformer_decoder for the proper naming of the sublayers. by @qlzh727 in #1230
- Replace tf with numpy by @mattdangerw in #1232
- Update to always using ops.shape by @mattdangerw in #1231
- Add a test harness based on keras-core's `run_layer_test` by @mattdangerw in #1238
- fixed token_to_id doc + error msg by @jackd in #1240
- Changed default TokenAndPositionEmbedding initializer to 'uniform' by @jackd in #1237
- Add compat shims for the upcoming keras-core release by @mattdangerw in #1244
- Depend on latest keras-core by @mattdangerw in #1246
- Removed the undefined self.sequence_length by @sahusiddharth in #1245
- Bump devcontainer to 3.9 by @mattdangerw in #1249
- Add a mixed precision test and fix mixed precision errors for layers by @mattdangerw in #1242
- Quick fix for 0.1.7 keras-core release by @mattdangerw in #1251
- Small docstring fixes for the upcoming release by @mattdangerw in #1253
- Don't export model internals publicly by @mattdangerw in #1255
- Bump master branch version number to 0.7.0.dev0 by @mattdangerw in #1254
- Fix/allow different encoder and decoder feature dimensions in transformer decoder layer by @ferraric in #1260
- Doc updates to switch branding to Keras 3 by @mattdangerw in #1259
- Remove unused TPU testing for backbones by @mattdangerw in #1266
- Make gelu a function, not a lambda so it can be loaded without safe_mode=False by @calvingiles in #1262
- Update requirements and install instructions for multi-backend keras by @mattdangerw in #1257
- Support Keras 3 installation by @mattdangerw in #1258
- Remove dtensor by @mattdangerw in #1268
- Add a lora dense layer by @mattdangerw in #1263
- Factor out testing routines for models by @mattdangerw in #1269
- Convert T5 to Keras 3 by @nkovela1 in #1274
- Fix missing backticks in DistilBertClassifier docstrings by @Philmod in #1278
- T5 checkpoint conversion with HF by @nkovela1 in #1277
- Use gelu_approximate directly in t5 presets by @mattdangerw in #1284
- Add preset tests and weights URLs by @nkovela1 in #1285
- Support loading keras 3 nightly by @mattdangerw in #1286
- Remove the use of `SentencePieceTrainer` from tests by @tirthasheshpatel in #1283
- Fix XLM-RoBERTa detokenize() by @abheesht17 in #1289
- Correct tie_embedding_weights and add logit checking by @nkovela1 in #1288
- Add detokenize testing for model tokenizers by @mattdangerw in #1290
- Fix Whisper by @abheesht17 in #1287
- Test against Keras 3 by @mattdangerw in #1273
- Support TF_USE_LEGACY_KERAS by @mattdangerw in #1295
- Run workflows with read-only tokens by @pnacht in #1305
- Update CONTRIBUTING.md by @mattdangerw in #1310
- Add GitHub Action for Nightly by @sampathweb in #1309
- Fix the publish to pypi action by @mattdangerw in #1311
- Fix nightly tf failure by @mattdangerw in #1316
- Switch deberta to use the "int" dtype by @mattdangerw in #1315
- Add security policy by @pnacht in #1319
- Fix missing export for reversible embedding by @mattdangerw in #1327
- Add `version` API to keras_nlp by @grasskin in #1324
- Fix Keras 3 version check by @sampathweb in #1328
- Simplify running KerasNLP with Keras 3 by @mattdangerw in #1308
- Fix issues with version by @mattdangerw in #1332
- Fix typo in whisper presets files by @mattdangerw in #1337
- `ELECTRA` backbone implementation in keras by @pranavvp16 in #1291
- Fix t5 tokenizer expected output by @mattdangerw in #1348
- Add init.py for electra by @mattdangerw in #1352
- Remove lora dense for now by @mattdangerw in #1359
- Adds Kokoro Build script for Keras-NLP GPU tests by @sampathweb in #1355
- Fixes GPU Test failures for Keras 3 by @sampathweb in #1361
- Change Continuous config to also run only large tests by @sampathweb in #1362
- ElectraTokenizer by @pranavvp16 in #1357
- Add MistralAI's 7B Transformer as a backbone in KerasNLP Models by @tirthasheshpatel in #1314
- changing pooling output by @mbrhd in #1364
- Add `LlamaBackbone` by @shivance in #1203
- Align pip_build with keras by @sampathweb in #1374
- Remove cloudbuild config by @mattdangerw in #1375
- Fix one last bad preset hash by @mattdangerw in #1381
- Add a tokenizer for the Mistral backbone by @tirthasheshpatel in #1383
- Kaggle Presets by @sampathweb in #1365
- Fix mistral and electra tokenizer to match kaggle changes by @mattdangerw in #1387
- Align requirments with Keras by @sampathweb in #1386
- Add a preprocessor for the Mistral backbone by @tirthasheshpatel in #1385
- Switch to always expect full Kaggle preset handles by @mattdangerw in #1390
- Fix test for recent keras 3 change by @mattdangerw in #1400
- Pass less state to jax generate function by @mattdangerw in #1398
- Add llama tokenizer by @mattdangerw in #1401
- Add Bloom Model by @abuelnasr0 in #1382
- Try fixing tests by @mattdangerw in #1411
- Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
- Bloom tokenizer by @abuelnasr0 in #1403
- Update black formatting by @mattdangerw in #1415
- Add Alibi bias layer by @abuelnasr0 in #1404
- Pin to `tensorflow-hub 0.16.0` to fix CI error by @sampathweb in #1420
- Update TF Text and remove TF Hub deps by @sampathweb in #1423
- Pin Jax Version in GPU CI by @sampathweb in #1430
- Add Bloom preprocessor by @abuelnasr0 in #1424
- Add layer attributes for all functional models by @mattdangerw in #1421
- Allow setting dtype per model by @mattdangerw in #1431
- Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
- Fix bart by @mattdangerw in #1434
- Add a settable property for sequence_length by @mattdangerw in #1437
- Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
- Bump the github-actions group with 1 update by @dependabot in #1438
- Add 7B presets for Mistral by @tirthasheshpatel in #1436
- Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
- bump version to 0.8 by @mattdangerw in #1441
- Update our sampler documentation to reflect usage by @mattdangerw in #1444
- Add Gemma model by @mattdangerw in #1448
- Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
- Add dtype arg to Gemma HF conversion script by @nkovela1 in #1452
- Fix gemma testing import by @mattdangerw in #1462
- Add docstring for PyTorch conversion script install instructions by @nkovela1 in #1471
- Add an annotation to tests that need kaggle auth by @mattdangerw in #1470
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Bump the master version to 0.9 by @mattdangerw in #1473
- Pin to TF 2.16 RC0 by @sampathweb in #1478
- Fix gemma rms_normalization's use of epsilon by @cpsauer in #1472
- Add `FalconBackbone` by @SamanehSaadat in #1475
- CI - Add kaggle creds to pull model by @sampathweb in #1459
- bug in example for ReversibleEmbedding by @TheCrazyT in #1484
- doc fix for constrastive sampler by @mattdangerw in #1488
- Remove broken link to masking and padding guide by @mattdangerw in #1487
- Fix a typo in causal_lm_preprocessors by @SamanehSaadat in #1489
- Fix dtype accessors of tasks/backbones by @mattdangerw in #1486
- Auto-labels 'gemma' on 'gemma' issues/PRs. by @shmishra99 in #1490
- Add BloomCausalLM by @abuelnasr0 in #1467
- Remove the bert jupyter conversion notebooks by @mattdangerw in #1492
- Add `FalconTokenizer` by @SamanehSaadat in #1485
- Add `FalconPreprocessor` by @SamanehSaadat in #1498
- Rename 176B presets & Add other presets into bloom_presets.py by @abuelnasr0 in #1496
- Add bloom presets by @abuelnasr0 in #1501
- Create workflow for auto assignment of issues and for stale issues by @sachinprasadhs in #1495
- Update requirements to TF 2.16 by @sampathweb in #1503
- Expose Task and Backbone by @mattdangerw in #1506
- Clean up and add our gemma conversion script by @mattdangerw in #1493
- Don't auto-update JAX GPU by @sampathweb in #1507
- Keep rope at float32 precision by @grasskin in #1497
- Bump the python group with 2 updates by @dependabot in #1509
- Fixes for the LLaMA backbone + add dropout by @tirthasheshpatel in #1499
- Add `LlamaPreprocessor` and `LlamaCausalLMPreprocessor` by @tirthasheshpatel in #1511
- Always run the rotary embedding layer in float32 by @tirthasheshpatel in #1508
- CI: Fix psutil - Remove install of Python 3.9 and alias of python3 by @sampathweb in #1514
- Update gemma_backbone.py for sharding config. by @qlzh727 in #1491
- Docs/modelling layers by @mykolaskrynnyk in #1502
- Standardize docstring by @sachinprasadhs in #1516
- Support tokenization of special tokens for word_piece_tokenizer by @abuelnasr0 in #1397
- Upload Model to Kaggle by @SamanehSaadat in #1512
- Add scoring mode to MistralCausalLM by @RyanMullins in #1521
- Add Mistral Instruct V0.2 preset by @tirthasheshpatel in #1520
- Add Tests for Kaggle Upload Validation by @SamanehSaadat in #1524
- Add presets for Electra and checkpoint conversion script by @pranavvp16 in #1384
- Allow saving / loading from Huggingface Hub preset by @Wauplin in #1510
- Stop on multiple end tokens by @grasskin in #1518
- Fix doc: `mistral_base_en` -> `mistral_7b_en` by @asmith26 in #1528
- Add lora example to GemmaCausalLM docstring by @SamanehSaadat in #1527
- Add LLaMA Causal LM with 7B presets by @tirthasheshpatel in #1526
- Add task base classes; support out of tree library extensions by @mattdangerw in #1517
- Doc fixes by @mattdangerw in #1530
- Run the LLaMA and Mistral RMS Layer Norm in float32 by @tirthasheshpatel in #1532
- Adds score API to GPT-2 by @RyanMullins in #1533
- increase pip timeout to 1000s to avoid connection resets by @sampathweb in #1535
- Adds the score API to LlamaCausalLM by @RyanMullins in #1534
- Implement compute_output_spec() for tokenizers with vocabulary. by @briango28 in #1523
- Remove staggler type annotiations by @mattdangerw in #1536
- Always run SiLU activation in float32 for LLaMA and Mistral by @tirthasheshpatel in #1540
- Bump the python group with 2 updates by @dependabot in #1538
- Disallow saving to preset from keras 2 by @SamanehSaadat in #1545
- Fix the rotary embedding computation in LLaMA by @tirthasheshpatel in #1544
- Fix re-compilation bugs by @mattdangerw in #1541
- Fix preprocessor from_preset bug by @mattdangerw in #1549
- Fix a strange issue with preprocessing layer output types by @mattdangerw in #1550
- Fix lowercase bug in wordpiece tokenizer by @abuelnasr0 in #1543
- Small docs updates by @mattdangerw in #1553
- Add a few new preset for gemma by @mattdangerw in #1556
- Fix the new stop_token_ids argument by @mattdangerw in #1558
- Fix tests with the "auto" default for stop token ids by @mattdangerw in #1559
- Fix `print_fn` issue in task test by @SamanehSaadat in #1563
- Update presets for code gemma by @mattdangerw in #1564
- Fix saving bug for untied weights with keras 3.2 by @mattdangerw in #1568
- 0.9 is out, nightly should be a preview of 0.10 now by @mattdangerw in #1570
- Do the reverse embedding in the same dtype as the input embedding by @mattdangerw in #1548
- Add support for positions array in `keras_nlp.layers.RotaryEmbedding` layer by @tirthasheshpatel in #1571
- Support Task Saving/Loading by @SamanehSaadat in #1547
- Improve error handling for non-keras model loading attempts by @SamanehSaadat in #1577
- Add Model Card for Hugging Face Upload by @SamanehSaadat in #1578
- Add Saving Tests by @SamanehSaadat in #1590
- Improve error handling for missing TensorFlow dependency in keras_nlp. by @SamanehSaadat in #1585
- Fix Keras import by @sampathweb in #1593
- Check kagglehub version before upload by @SamanehSaadat in #1594
- Change the order of importing `keras` by @james77777778 in #1596
- Add backend info to HF model card by @SamanehSaadat in #1599
- Bump required kagglehub version to 0.2.4 by @SamanehSaadat in #1600
- Bump `bert_tiny_en_uncased_sst2` classifier version by @SamanehSaadat in #1602
- Allow a task preprocessor to be an argument in from_preset by @SamanehSaadat in #1603
- API Generation by @sampathweb in #1608
- Update readme with some recent changes by @mattdangerw in #1575
- Bump the python group with 2 updates by @dependabot in #1611
- Add CodeGemma 1.1 presets by @grasskin in #1617
- Fix rope scaling factor by @abuelnasr0 in #1605
- Fix the issue of propagating `training` argument in subclasses by @james77777778 in #1623
- Pass kwargs to tokenizer when creating preprocessor by @SamanehSaadat in #1632
- Add phi3 by @abuelnasr0 in #1597
- Add LLaMA 3 tokenizer and preset by @tirthasheshpatel in #1584
- Export missing llama 3 symbol by @mattdangerw in #1633
- PaliGemma by @mattdangerw in #1636
- Update pali_gemma_presets.py by @divyashreepathihalli in #1637
- Update version to 0.13.0 for the master branch by @mattdangerw in #1640
- Update llama3 preset versions by @mattdangerw in #1641
- extra argument in save_to_preset method by @sineeli in #1634
- Fix a typo in an error handling message by @SamanehSaadat in #1647
- Fix a typo in phi3 metadata by @mattdangerw in #1646
- Add `FalconCausalLM` by @SamanehSaadat in #1635
- Add include rescaling to the pali gemma backbone by @mattdangerw in #1650
- PaliGemma docstring fix by @mattdangerw in #1651
- Fix newline characters for pali_gemma by @mattdangerw in #1655
- Remove dead code by @mattdangerw in #1659
- Fix some testing on the latest version of keras by @mattdangerw in #1663
- Vicuna Models checkpoints transfer script by @sineeli in #1657
- Add documented but missing methods for some tokenizers by @SamanehSaadat in #1664
- Changed from_preset file downloading to use GFile when able by @VarunS1997 in #1665
- Fix gfile downloads by @mattdangerw in #1666
- More error handling for gfile by @mattdangerw in #1667
- Update error message by @mattdangerw in #1668
- Ditch Keras 2 support by @mattdangerw in #1658
- fix GemmaBackbone.get_layout_map + test by @martin-gorner in #1669
- Covert a `safetensor` checkpoint from Hugging Face hub by @ariG23498 in #1662
- Add Gemma 2 model by @grasskin in #1673
- Version bump to 0.14.0.dev0 by @grasskin in #1675
- Revert "Version bump to 0.14.0.dev0" by @grasskin in #1676
- Remove Keras pin, fix tests by @mattdangerw in #1681
- Add quantization support for `Gemma`, `Gemma2` and `PaliGemma` by @james77777778 in #1670
- add vicuna preset by @sineeli in #1672
- Porting Gemma 2 transformers checkpoint by @ariG23498 in #1678
- Improve CI speed and resolve issues of `run_quantization_check` by @james77777778 in #1682
- Remove build_from_signature from MHA layers by @mattdangerw in #1687
- Refactoring: in CachedMultiHeadAttention call MHA methods instead of recoding the attention calculation by @apehex in #1684
- Porting PaliGemma transformers checkpoint by @ariG23498 in #1686
- Allow importing keras_nlp without tensorflow by @mattdangerw in #1660
- Add flag to gemma conversion script to specify local orbax by @mattdangerw in #1688
- Fix compatibility for earlier versions of Keras by @james77777778 in #1690
- Add a test against keras-nightly by @mattdangerw in #1693
- Fix dtype bugs in `ReversibleEmbedding` and `LayerNorm` by @james77777778 in #1692
- Partially revert #1687 by @mattdangerw in #1695
- Fix quantization test for `XLNet` by @james77777778 in #1699
- Add a HF BERT converter, improve safetensor loading by @mattdangerw in #1694
- Add a subtle fix for gemma 2 conversions by @mattdangerw in #1701
- One more small Gemma conversion fix by @mattdangerw in #1702
- Slightly more defensive handling of type for backbone by @mattdangerw in #1703
- Add support for converting Gemma 2 checkpoints by @mattdangerw in #1700
- Make it clearer what is running in the github action UI by @mattdangerw in #1707
- Try upgrading tensorflow pin by @mattdangerw in #1706
- Bump version to fix query norm in Gemma 2 9b by @mattdangerw in #1709
- Gemma: Add logit soft-capping to score function. by @RyanMullins in #1712
- Version bump HEAD to 0.15 by @mattdangerw in #1713
- Port gpt2 transformers checkpoint by @cosmo3769 in #1704
- Add soft capping to reversible embedding layer by @mattdangerw in #1718
- Add presets for gemma 2 2b by @mattdangerw in #1721
- Utilize `to_numpy=True` in `quantize` if available by @james77777778 in #1725
- Dynamic int8 quantization for Llama2 and Llama3 by @james77777778 in #1720
- Bump the python group with 2 updates by @dependabot in #1726
- Shield gemma shortnames by @mattdangerw in #1731
- Sliding window fixes by @mattdangerw in #1738
- Add int8 models to Llama2 and Llama3 by @james77777778 in #1734
- Port distilbert transformer checkpoint by @cosmo3769 in #1736
- Add support of `kwargs` to `Backbone.from_preset` and fix the dtype forwarding in `Task.from_preset` by @james77777778 in #1742
- Remove src init file contents by @mattdangerw in #1743
- Remove ROADMAP.md by @mattdangerw in #1773
- Fix nested list in args on keras.io by @mattdangerw in #1772
- Remove stale tf only examples by @mattdangerw in #1771
- Limit the default sequence length to 1024 for all models by @mattdangerw in #1770
- Consistent preprocessing output on all backends by @mattdangerw in #1777
- Port albert transformer checkpoint by @cosmo3769 in #1767
- Lower the default learning rate for albert by @mattdangerw in #1786
- Port bart transformer checkpoint by @cosmo3769 in #1783
- Add an option to disable default compilation by @mattdangerw in #1787
- Port mistral transformer checkpoint by @cosmo3769 in #1768
- [Bart]Fix missing weight port by @cosmo3769 in #1789
- Remove python 3.8 version in setup.py by @mattdangerw in #1792
- Class detection works for huggingface checkpoints by @mattdangerw in #1800
- Rename KerasNLP symbols for a multi-modal future by @mattdangerw in #1803
- Move preprocessing to base classes by @mattdangerw in #1807
- Add `add_bos=False, add_eos=False` to `SentencePieceTokenizer.__init__()` by @briango28 in #1811
- Only load a full task config when `load_task_extras` is passed by @mattdangerw in #1812
- Add image and audio converter classes by @mattdangerw in #1813
- Simplify registering "built-in" presets by @mattdangerw in #1818
- Support image and audio information in task summaries by @mattdangerw in #1819
- Take two of #1812, simpler classifier head loading by @mattdangerw in #1823
- Remove preprocessing layers we no longer use by @mattdangerw in #1824
- Add missing aliases by @mattdangerw in #1828
- Bump nightly and head package version to 0.16 by @mattdangerw in #1826
- Fix `post_attention_norm` name in Gemma by @SamanehSaadat in #1834
- Update README.md by @mattdangerw in #1837
- Fix saved classifier models from before 0.14 by @mattdangerw in #1839
- Fix device scope issues by @mattdangerw in #1841
- Preprocessing decorator fixes by @mattdangerw in #1843
- Keras hub rename by @mattdangerw in #1840
- Update README.md by @mattdangerw in #1846
- Add imagenet prediction decoder by @mattdangerw in #1848
- Update links github links post rename by @mattdangerw in #1851
- Add anchor_generator, box_matcher and non_max_supression by @sineeli in #1849
- Keras nlp shim by @mattdangerw in #1853
- Only publish KerasNLP if we have a release tag by @mattdangerw in #1854
- Make keras-nlp-nightly shim depend on keras-hub-nightly shim by @mattdangerw in #1856
- Version bump by @mattdangerw in #1857
- Fix api_export.py by @divyashreepathihalli in #1858
- Expunge include_rescaling from backbones by @mattdangerw in #1859
- add SAM model by @divyashreepathihalli in #1847
- Add StableDiffusion3 by @james77777778 in #1820
- Weights densenet by @sachinprasadhs in #1855
- Add StableDiffusion3 preset by @james77777778 in #1884
- Remove copyright notices by @fchollet in #1882
- update preprocessor docstring by @divyashreepathihalli in #1881
- Update kokoro for new name by @mattdangerw in #1852
- update README by @divyashreepathihalli in #1886
- update jax version by @divyashreepathihalli in #1888
- Remove remaining copyright mentions by @sachinprasadhs in #1889
- [RetinaNet] Add FPN, RetinaNet label encoder as part of phase 1 by @sineeli in #1885
- Add SAM weights conversion and preprocessor flow by @divyashreepathihalli in #1891
- added "tie_word_emebeddings" setting necessary for Llama 3.2 by @martin-gorner in #1895
- MixTransformer Argument Clarification by @DavidLandup0 in #1894
- Update README by @divyashreepathihalli in #1890
- Allow backbone to be any functional, preprocessor any callable by @mattdangerw in #1900
- Update the conversion script to match preprocessor names and register presets by @divyashreepathihalli in #1902
- Image classifier changes by @mattdangerw in #1901
- Bump the python group with 2 updates by @dependabot in #1898
- Add `VAEBackbone` and use it for SD3 by @james77777778 in #1892
- Add Deeplabv3Plus and DeepLabV3 with segmentation by @sachinprasadhs in #1869
- Support Gemma2 checkpoint conversion by @jeffcarp in #1905
- BytePairTokenizer must not split sequences of \n by @martin-gorner in #1910
- fix for generation that never stops in Llama3-Instruct variants by @martin-gorner in #1904
- fix failing JAX GPU test by @divyashreepathihalli in #1911
- Refactor `MMDiT`, add `ImageToImage` and `Inpaint` for SD3 by @james77777778 in #1909
- Minor bug fix to image_converter.image_size by @sachinprasadhs in #1915
- [Mix Transformer] Add Presets for MiTB0...MiTB5 by @DavidLandup0 in #1893
- remove default resizing for vision backbones by @divyashreepathihalli in #1916
- Update VGG model to be compatible with Timm weights and add conversion scripts by @jeffcarp in #1914
- Deeplab presets by @sachinprasadhs in #1918
- update presets to point to the main Keras Kaggle page by @divyashreepathihalli in #1921
- Added test for the way BytePairTokenizer handles the \n\n sequence, which is important in Lama chat templates by @martin-gorner in #1912
- Task models fix by @martin-gorner in #1922
- adding option strip_prompt to generate() by @martin-gorner in #1913
- Layout map for Llama by @martin-gorner in #1923
- Update deeplab_v3_presets.py by @divyashreepathihalli in #1924
- Add paths to get SAM weights from by @divyashreepathihalli in #1925
- Two fixes for image resizing in preprocessing by @mattdangerw in #1927
- add back default image resizing by @divyashreepathihalli in #1926
- Update deeplab_v3_presets.py by @divyashreepathihalli in #1928
- Update PaliGemma to remove `include_rescaling` arg by @divyashreepathihalli in #1917
- fix path by @sachinprasadhs in #1929
- Fix paligemma checkpoint conversion script by @divyashreepathihalli in #1931
- update preset path to point to latest version of models by @divyashreepathihalli in #1932
- Update sdv3 path by @sachinprasadhs in #1934
- update sam docstring to show correct backbone in docstring by @divyashreepathihalli in #1936
- Convert input dictionary to tensors during train_on_batch by @wenxindongwork in #1919
- Register VGG presets. by @sachinprasadhs in #1935
- Add ResNetVD presets by @gowthamkpr in #1897
- Rename mix_tranformer to mit by @divyashreepathihalli in #1937
- Fix docstrings and args default values by @divyashreepathihalli in #1938
- Update sam_presets.py by @divyashreepathihalli in #1940
- Update vit_det_backbone.py by @divyashreepathihalli in #1941
- fix gpu test by @divyashreepathihalli in #1939
- Added Support for Returning Attention Scores in TransformerEncoder call by @anirudhr20 in #1879
- Mark preset tests as large by @divyashreepathihalli in #1942
New Contributors
- @haifeng-jin made their first contribution in #618
- @mbrukman made their first contribution in #628
- @soma2000-lang made their first contribution in #665
- @NusretOzates made their first contribution in #664
- @shivance made their first contribution in #673
- @Plutone11011 made their first contribution in #714
- @TheAthleticCoder made their first contribution in #731
- @Neeshamraghav012 made their first contribution in #734
- @apupneja made their first contribution in #751
- @Cyber-Machine made their first contribution in #690
- @atharvapurdue made their first contribution in #781
- @fchollet made their first contribution in #747
- @susnato made their first contribution in #771
- @jaygala223 made their first contribution in #805
- @abodinier made their first contribution in #790
- @Akorex made their first contribution in #853
- @Warlord-K made their first contribution in #893
- @Sruinard made their first contribution in #871
- @abuelnasr0 made their first contribution in #882
- @SamuelMarks made their first contribution in #1057
- @ferraric made their first contribution in #1083
- @ianstenbit made their first contribution in #1160
- @qlzh727 made their first contribution in #1230
- @jackd made their first contribution in #1240
- @sahusiddharth made their first contribution in #1245
- @calvingiles made their first contribution in #1262
- @tirthasheshpatel made their first contribution in #1283
- @pnacht made their first contribution in #1305
- @pranavvp16 made their first contribution in #1291
- @mbrhd made their first contribution in #1364
- @dependabot made their first contribution in #1438
- @cpsauer made their first contribution in #1472
- @TheCrazyT made their first contribution in #1484
- @shmishra99 made their first contribution in #1490
- @mykolaskrynnyk made their first contribution in #1502
- @RyanMullins made their first contribution in #1521
- @Wauplin made their first contribution in #1510
- @asmith26 made their first contribution in #1528
- @briango28 made their first contribution in #1523
- @VarunS1997 made their first contribution in #1665
- @martin-gorner made their first contribution in #1669
- @ariG23498 made their first contribution in #1662
- @apehex made their first contribution in #1684
- @cosmo3769 made their first contribution in #1704
- @DavidLandup0 made their first contribution in #1894
- @jeffcarp made their first contribution in #1905
- @wenxindongwork made their first contribution in #1919
- @anirudhr20 made their first contribution in #1879
Full Changelog: v0.4.0...v0.17.0.dev0