Releases: keras-team/keras-hub
v0.8.2
Summary
- Mistral fixes for dtype and memory usage. #1458
What's Changed
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Version bump for dev release by @mattdangerw in #1474
Full Changelog: v0.8.1...v0.8.2.dev0
v0.8.1
Minor fixes to Kaggle Gemma assets.
What's Changed
- Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
- Dev release 0.8.1.dev0 by @mattdangerw in #1456
- 0.8.1 version bump by @mattdangerw in #1457
Full Changelog: v0.8.0...v0.8.1
v0.8.0
The 0.8.0 release focuses on generative LLM features in KerasNLP.
Summary
- Added the `Mistral` and `Gemma` models.
- Allow passing `dtype` directly to backbone and task constructors.
- Add a settable `sequence_length` property to all preprocessing layers.
- Added `enable_lora()` to the backbone class for parameter efficient fine-tuning.
- Added layer attributes to backbone models for easier access to model internals.
- Added `AlibiBias` layer.
import keras_nlp

# Pass dtype to a model.
causal_lm = keras_nlp.models.MistralCausalLM.from_preset(
    "mistral_instruct_7b_en",
    dtype="bfloat16",
)
# Settable sequence length property.
causal_lm.preprocessor.sequence_length = 128
# Lora API (enable_lora() lives on the backbone class).
causal_lm.backbone.enable_lora(rank=4)
# Easy layer attributes.
for layer in causal_lm.backbone.transformer_layers:
    print(layer.count_params())
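To give a feel for why `enable_lora()` is described as parameter efficient, here is a rough arithmetic sketch of the general LoRA idea: a dense kernel of shape `(d_in, d_out)` is frozen, and two small factors of shapes `(d_in, rank)` and `(rank, d_out)` are trained in its place. The layer sizes below are hypothetical, and the sketch does not describe which layers KerasNLP actually targets.

```python
def dense_params(d_in, d_out):
    """Weights in a full (d_in x d_out) dense kernel, ignoring bias."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Weights in the two low-rank factors: (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out)


# Hypothetical layer sizes; rank=4 matches the enable_lora(rank=4) call above.
d_in, d_out, rank = 4096, 4096, 4
full = dense_params(d_in, d_out)
low_rank = lora_params(d_in, d_out, rank)

print(f"full kernel:  {full:,} weights")
print(f"LoRA factors: {low_rank:,} weights ({low_rank / full:.2%} of the kernel)")
```

The trainable count grows linearly with `rank`, so a small rank keeps the number of updated weights at a small fraction of the frozen kernel.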
What's Changed
- Fix test for recent keras 3 change by @mattdangerw in #1400
- Pass less state to jax generate function by @mattdangerw in #1398
- Add llama tokenizer by @mattdangerw in #1401
- Add Bloom Model by @abuelnasr0 in #1382
- Try fixing tests by @mattdangerw in #1411
- Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
- Bloom tokenizer by @abuelnasr0 in #1403
- Update black formatting by @mattdangerw in #1415
- Add Alibi bias layer by @abuelnasr0 in #1404
- Pin to `tensorflow-hub 0.16.0` to fix CI error by @sampathweb in #1420
- Update TF Text and remove TF Hub deps by @sampathweb in #1423
- Pin Jax Version in GPU CI by @sampathweb in #1430
- Add Bloom preprocessor by @abuelnasr0 in #1424
- Add layer attributes for all functional models by @mattdangerw in #1421
- Allow setting dtype per model by @mattdangerw in #1431
- Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
- Fix bart by @mattdangerw in #1434
- Add a settable property for sequence_length by @mattdangerw in #1437
- Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
- Bump the github-actions group with 1 update by @dependabot in #1438
- Add 7B presets for Mistral by @tirthasheshpatel in #1436
- Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
- bump version to 0.8 by @mattdangerw in #1441
- Update our sampler documentation to reflect usage by @mattdangerw in #1444
- Add Gemma model by @mattdangerw in #1448
- Version bump for dev release by @mattdangerw in #1449
- Version bump to 0.8.0 by @mattdangerw in #1450
New Contributors
- @dependabot made their first contribution in #1438
- @divyashreepathihalli made their first contribution in #1440
Full Changelog: v0.7.0...v0.8.0
v0.17.0.dev0
Summary
- 📢 KerasNLP and KerasCV are now becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
- Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and `self` in calls to `super()` by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create `Backbone` base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix `value_dim` in `TransformerDecoder`'s cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose `token_embedding` as a Backbone Property by @abheesht17 in #676
- Move `from_preset` to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add `AlbertClassifier` by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to `GPT2Preprocessor` by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_t...
v0.7.0
This release integrates KerasNLP and Kaggle Models. KerasNLP models will now work in Kaggle offline notebooks and all assets will quickly attach to a notebook rather than needing a slow download.
Summary
KerasNLP pre-trained models are now all made available through Kaggle Models. You can see all models currently available in both KerasCV and KerasNLP here. Individual model pages will include example usage and a file browser to examine all available assets for a model preset.
This change will not affect the existing usage of `from_preset()`. Statements like `keras_nlp.models.BertClassifier.from_preset("bert_base_en")` will continue to work and download checkpoints from the Kaggle Models hub.
A note on model saving: for saving support across Keras 2 and Keras 3, we recommend using the new Keras saved model format. You can use `model.save('path/to/location.keras')` for a full model and `model.save_weights('path/to/location.weights.h5')` for checkpoints. See the Keras saving guide for more details.
What's Changed
- Don't export model internals publicly by @mattdangerw in #1255
- Bump master branch version number to 0.7.0.dev0 by @mattdangerw in #1254
- Fix/allow different encoder and decoder feature dimensions in transformer decoder layer by @ferraric in #1260
- Doc updates to switch branding to Keras 3 by @mattdangerw in #1259
- Remove unused TPU testing for backbones by @mattdangerw in #1266
- Make gelu a function, not a lambda so it can be loaded without safe_mode=False by @calvingiles in #1262
- Update requirements and install instructions for multi-backend keras by @mattdangerw in #1257
- Support Keras 3 installation by @mattdangerw in #1258
- Remove dtensor by @mattdangerw in #1268
- Add a lora dense layer by @mattdangerw in #1263
- Factor out testing routines for models by @mattdangerw in #1269
- Convert T5 to Keras 3 by @nkovela1 in #1274
- Fix missing backticks in DistilBertClassifier docstrings by @Philmod in #1278
- T5 checkpoint conversion with HF by @nkovela1 in #1277
- Use gelu_approximate directly in t5 presets by @mattdangerw in #1284
- Add preset tests and weights URLs by @nkovela1 in #1285
- Support loading keras 3 nightly by @mattdangerw in #1286
- Remove the use of `SentencePieceTrainer` from tests by @tirthasheshpatel in #1283
- Fix XLM-RoBERTa detokenize() by @abheesht17 in #1289
- Correct tie_embedding_weights and add logit checking by @nkovela1 in #1288
- Add detokenize testing for model tokenizers by @mattdangerw in #1290
- Fix Whisper by @abheesht17 in #1287
- Test against Keras 3 by @mattdangerw in #1273
- Support TF_USE_LEGACY_KERAS by @mattdangerw in #1295
- Run workflows with read-only tokens by @pnacht in #1305
- Update CONTRIBUTING.md by @mattdangerw in #1310
- Add GitHub Action for Nightly by @sampathweb in #1309
- Fix the publish to pypi action by @mattdangerw in #1311
- Fix nightly tf failure by @mattdangerw in #1316
- Switch deberta to use the "int" dtype by @mattdangerw in #1315
- Add security policy by @pnacht in #1319
- Fix missing export for reversible embedding by @mattdangerw in #1327
- Add `version` API to keras_nlp by @grasskin in #1324
- Fix Keras 3 version check by @sampathweb in #1328
- Simplify running KerasNLP with Keras 3 by @mattdangerw in #1308
- Fix issues with version by @mattdangerw in #1332
- Fix typo in whisper presets files by @mattdangerw in #1337
- `ELECTRA` backbone implementation in keras by @pranavvp16 in #1291
- Fix t5 tokenizer expected output by @mattdangerw in #1348
- Add `__init__.py` for electra by @mattdangerw in #1352
- Remove lora dense for now by @mattdangerw in #1359
- Adds Kokoro Build script for Keras-NLP GPU tests by @sampathweb in #1355
- Fixes GPU Test failures for Keras 3 by @sampathweb in #1361
- Change Continuous config to also run only large tests by @sampathweb in #1362
- ElectraTokenizer by @pranavvp16 in #1357
- Add MistralAI's 7B Transformer as a backbone in KerasNLP Models by @tirthasheshpatel in #1314
- changing pooling output by @mbrhd in #1364
- Add `LlamaBackbone` by @shivance in #1203
- Align pip_build with keras by @sampathweb in #1374
- Remove cloudbuild config by @mattdangerw in #1375
- Fix one last bad preset hash by @mattdangerw in #1381
- Add a tokenizer for the Mistral backbone by @tirthasheshpatel in #1383
- Kaggle Presets by @sampathweb in #1365
- Fix mistral and electra tokenizer to match kaggle changes by @mattdangerw in #1387
- Align requirments with Keras by @sampathweb in #1386
- Add a preprocessor for the Mistral backbone by @tirthasheshpatel in #1385
- Switch to always expect full Kaggle preset handles by @mattdangerw in #1390
New Contributors
- @calvingiles made their first contribution in #1262
- @tirthasheshpatel made their first contribution in #1283
- @pnacht made their first contribution in #1305
- @grasskin made their first contribution in #1324
- @pranavvp16 made their first contribution in #1291
- @mbrhd made their first contribution in #1364
Full Changelog: v0.6.4...v0.7.0
v0.6.4
Summary
This point release simplifies our support for Keras 3 and Keras 2.
- If Keras 2 is installed, KerasNLP will use Keras 2 and TensorFlow.
- If Keras 3 is installed, KerasNLP will use Keras 3 and run on any backend.
If you have any issue installing KerasNLP, please open an issue.
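As a rough illustration of the rule above, the sketch below reads the installed `keras` version and reports which mode to expect. It uses only the standard library; the helper names are hypothetical, and this is not how KerasNLP itself performs the check.

```python
from importlib import metadata


def installed_keras_major():
    """Return the major version of the installed `keras` package, or None."""
    try:
        version = metadata.version("keras")
    except metadata.PackageNotFoundError:
        return None
    return int(version.split(".")[0])


def expected_mode(keras_major):
    """Map a Keras major version to the behavior described in the release notes."""
    if keras_major is None:
        return "keras not installed"
    if keras_major >= 3:
        return "Keras 3, any backend"
    return "Keras 2, TensorFlow only"


print(expected_mode(installed_keras_major()))
```

Running this before filing an installation issue can help confirm which Keras version the environment actually resolves to.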
What's Changed
- 0.6.4 cherry picks by @mattdangerw in #1350
- Version bump for 0.6.4.dev0 pre-release by @mattdangerw in #1351
- Version bump for 0.6.4 release by @mattdangerw in #1356
Full Changelog: v0.6.3...v0.6.4
v0.6.3
Summary
This release adds support for running KerasNLP against Keras 3. You can try this today by installing `tf-nightly` and `tensorflow-text-nightly`.
pip install keras-nlp
pip uninstall -y tensorflow-text tensorflow keras
pip install tensorflow-text-nightly tf-nightly
Otherwise, this release should be a no-op for all users. No new features, no change in default behavior.
Upcoming changes
After the release of Keras 3, we will drop support for running KerasNLP against the Keras Core package (no more `import keras_core as keras`), in favor of Keras 3. Keras 3 is the long-term replacement for Keras Core.
What's Changed
- Cherry picks for 0.6.3 by @mattdangerw in #1297
- Version bump 0.6.3 by @mattdangerw in #1298
- Bump the version to 0.6.3.dev1 by @mattdangerw in #1301
- Version bump to 0.6.3 by @mattdangerw in #1302
Full Changelog: v0.6.2...v0.6.3
v0.6.2
Summary
- Support mixed precision on keras-core on all of jax, torch and tensorflow.
- Add `keras_nlp.layers.RotaryEmbedding` for rotary embeddings.
- Add `keras_nlp.layers.ReversibleEmbedding` to better support tied or untied weights for logit projections.
- Many bug fixes and improvements.
What's Changed
- Generic `RotaryEmbedding` Layer by @shivance in #1180
- Raise ValueError when number of dims evaluate to zero by @sampathweb in #1198
- Add XLNetBackbone by @susnato in #1084
- Switch from tf.nest to dm-tree by @mattdangerw in #1199
- Fix CI for keras-core 0.1.4 by @mattdangerw in #1202
- Fix ModuleNotFoundError `keras_nlp.models.xlnet` by @shivance in #1204
- Add support for "untied" embedding weights in language models by @mattdangerw in #1201
- Add start_index argument to all position embedding layers by @mattdangerw in #1209
- Remove windows line endings by @mattdangerw in #1210
- Fix Autograph error with perplexity metric by @shivance in #1211
- [JAX backend]: Fix errors with perplexity by @shivance in #1213
- Improve layer naming consistency by @mattdangerw in #1219
- Stop asserting key order in bart preprocessor by @mattdangerw in #1221
- Remove file level docstrings by @mattdangerw in #1222
- Fix typos by @mattdangerw in #1220
- Typo fix by @mattdangerw in #1223
- Fix RotaryEmbedding import by @shivance in #1217
- Update transformer_decoder for the proper naming of the sublayers. by @qlzh727 in #1230
- Replace tf with numpy by @mattdangerw in #1232
- Update to always using ops.shape by @mattdangerw in #1231
- Add a test harness based on keras-core's `run_layer_test` by @mattdangerw in #1238
- fixed token_to_id doc + error msg by @jackd in #1240
- Changed default TokenAndPositionEmbedding initializer to 'uniform' by @jackd in #1237
- Add compat shims for the upcoming keras-core release by @mattdangerw in #1244
- Depend on latest keras-core by @mattdangerw in #1246
- Removed the undefined self.sequence_length by @sahusiddharth in #1245
- Bump devcontainer to 3.9 by @mattdangerw in #1249
- Add a mixed precision test and fix mixed precision errors for layers by @mattdangerw in #1242
- Quick fix for 0.1.7 keras-core release by @mattdangerw in #1251
- Small docstring fixes for the upcoming release by @mattdangerw in #1253
New Contributors
- @qlzh727 made their first contribution in #1230
- @jackd made their first contribution in #1240
- @sahusiddharth made their first contribution in #1245
Full Changelog: v0.6.1...v0.6.2
v0.6.1
With the 0.6.1 release, all remaining models, metrics and samplers have been ported to keras-core. The full KerasNLP API is now available on TensorFlow, PyTorch and Jax (instructions).
Summary
- FNet and DeBERTa are now multi-backend.
- All `keras_nlp.models.FNetXX` and `keras_nlp.models.DebertaV3XX` symbols work on all backends.
- All `keras_nlp.samplers.BeamSampler` and `keras_nlp.samplers.ContrastiveSampler` work on all backends.
- All `keras_nlp.metrics` classes work on all backends.
- For Jax and PyTorch, pass python strings to metrics (as tensor strings are strictly tensorflow).
- Restored the `mask_positions` named argument to `MaskedLMHead`.
What's Changed
- Update README for Keras Core by @jbischof in #1135
- Ignore errors in UTF-8 decoding by @abheesht17 in #1150
- Ports GPTNeoX to KerasCore by @shivance in #1137
- Small fix for mixed precision generation on tf by @mattdangerw in #1153
- Port DeBERTa to multi-backend by @abheesht17 in #1155
- Change all tensors passed to tf.data.Dataset to numpy by @mattdangerw in #1161
- Fix broken tests by @mattdangerw in #1163
- Pin keras-core to 0.1.0 while investigating failures by @mattdangerw in #1168
- Run GPU tests on Jax + Torch by @ianstenbit in #1160
- Fix flakes in masked lm testing by removing any indeterminism by @mattdangerw in #1171
- Always install the correct version with pip_build by @mattdangerw in #1174
- Remove tests for preprocessing inside a functional model by @mattdangerw in #1175
- Extend the timeout for large tests by @mattdangerw in #1103
- Add `GPTNeoXCausalLM` by @shivance in #1110
- Bump tensorflow to latest stable by @mattdangerw in #1170
- Add compute_output_shape to tokenizer by @shivance in #1166
- Stop pinning keras-core by @mattdangerw in #1178
- Port FNet by @abheesht17 in #1164
- Automate the update image flow by @mattdangerw in #1179
- Restore mask_position argument name by @mattdangerw in #1185
- Port contrastive sampler to multi-backend by @mattdangerw in #1187
- Port `BeamSampler` to core by @shivance in #1181
- Port metrics to multi-backend by @mattdangerw in #1186
New Contributors
- @ianstenbit made their first contribution in #1160
Full Changelog: v0.6.0...v0.6.1
v0.6.0
KerasNLP is adding experimental support for Jax and PyTorch backends on top of the Keras Core library. Read the announcement and browse the full library documentation, including how to specify the backend when running your code.
Support for both Jax and PyTorch is still experimental; expect some rough edges and please give us feedback!
Summary
- This release should be equivalent to `0.5.2` with the addition of multi-backend support.
- The following API symbols are currently restricted to the tensorflow backend:
  - `keras_nlp.models.DebertaV3*`
  - `keras_nlp.models.FNet*`
  - `keras_nlp.metrics`
  - `keras_nlp.samplers.BeamSampler`
  - `keras_nlp.samplers.ContrastiveSampler`
- Note that there are two ways you can run on top of Tensorflow.
  - If you run your scripts/colab without any changes, KerasNLP will use tf.keras for all layer and modeling implementations. This should be a no-op from previous releases of the library.
  - If you run your scripts/colab with `KERAS_BACKEND={jax, torch, tensorflow}`, you will be trying the new Keras Core library, using the specified backend. This is a great way to test out the future of the library!
  - Full details on runtime specification are available along with the Keras Core documentation.
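The `KERAS_BACKEND` variable can also be set from Python, as long as that happens before the library is imported. The sketch below is a minimal illustration; `select_backend` is a hypothetical helper, not part of KerasNLP, and the final import is left commented out so the snippet runs without the library installed.

```python
import os

# Backend names taken from the release notes above.
SUPPORTED_BACKENDS = ("jax", "torch", "tensorflow")


def select_backend(name):
    """Validate a backend name and export it through KERAS_BACKEND."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")
print("KERAS_BACKEND =", os.environ["KERAS_BACKEND"])

# The variable is read at import time, so the import must come afterwards:
# import keras_nlp
```

Setting the variable in the shell (for example `KERAS_BACKEND=jax python script.py`) has the same effect and avoids any ordering concerns inside the script.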
What's Changed
- small updates to the release doc by @chenmoneygithub in #1031
- Sampler docstring edit by @abuelnasr0 in #1033
- Fix program crash for id_to_token() method in SentencePieceTokenizer by @abuelnasr0 in #1040
- Update our release process to preview docs before release by @mattdangerw in #1043
- Add Whisper Tokenizer and Audio Feature Extractor by @abheesht17 in #847
- Also strip padding token for opt by @mattdangerw in #1028
- Add regex dep by @mattdangerw in #1044
- Add BartSeq2SeqLM and conditional text generation with BART by @abheesht17 in #974
- Support list/tuple inputs for special tokens in StartEndPacker layer by @abheesht17 in #1045
- Support list/tuple inputs for special tokens in MultiSegmentPacker layer by @abheesht17 in #1046
- Fix a misleading part of our cached MHA docs by @mattdangerw in #1048
- Always pass weight name by kwarg by @mattdangerw in #1053
- Always pass metrics in a list or dict by @mattdangerw in #1054
- Move `Defaults to` to end of arg docstring and standardise values by @SamuelMarks in #1057
- Fix beam search for BART by @abheesht17 in #1058
- Replace tf.dtype with "dtype" by @mattdangerw in #1059
- Test shapes directly by @mattdangerw in #1064
- Clean up metrics tests by @mattdangerw in #1063
- Remove metrics merge tests by @mattdangerw in #1065
- Fix whisper feature inputs by @mattdangerw in #1069
- Always specify shape when creating variables by @mattdangerw in #1067
- Remove ragged support from position embeddings by @mattdangerw in #1068
- Clean up dtype handling for preprocessing layers by @mattdangerw in #1066
- Add BART finetuned on CNN+DM for summarisation by @abheesht17 in #1060
- Fix saving bug by @mattdangerw in #1073
- Fix t5 forward pass by @mattdangerw in #1082
- Feat/make transformer decoder callable without causal mask by @ferraric in #1083
- Adding `GPTNeoXBackbone` by @shivance in #1056
- Add a common test case by @mattdangerw in #1095
- Update register_keras_serializable to use saving module by @mattdangerw in #1094
- Don't test tf format by @mattdangerw in #1104
- Add `GPTNeoXPreprocessor` by @shivance in #1093
- Split layers into layers/modeling & layers/preprocessing by @mattdangerw in #1102
- Fix merge conflict from #1102 by @mattdangerw in #1105
- Add a common base class for generative models by @mattdangerw in #1096
- Add `GPTNeoXCausalLMPreprocessor` by @shivance in #1106
- Add Whisper Presets by @abheesht17 in #1089
- Refactor `RotaryEmbedding` and `GPTNeoXAttention` by @shivance in #1101
- Remove all the secret keys for ci by @mattdangerw in #1126
- Fix publish to pypi action by @mattdangerw in #1127
- Unexport models that are not in the 0.6 release by @mattdangerw in #1125
- Bump the version to 0.6.0 by @mattdangerw in #1128
New Contributors
- @SamuelMarks made their first contribution in #1057
- @ferraric made their first contribution in #1083
Full Changelog: v0.5.2...v0.6.0