Releases: keras-team/keras-hub
v0.18.1
Summary
- Minor bug fix point release.
- Remove einops code from flux model.
- Fix specifying dtype during task from_preset.
What's Changed
- Adding PaliGemma2 to KerasHub by @divyashreepathihalli in #1998
- Update pali_gemma_presets.py by @divyashreepathihalli in #2003
- Remove einops dep by @mattdangerw in #2006
- Fix dtype when creating a task with a task.json by @mattdangerw in #2007
- Version bump dev release by @mattdangerw in #2009
- Version bump release by @mattdangerw in #2010
Full Changelog: v0.18.0...v0.18.1
v0.18.0
Summary
- New Models.
- PaliGemma 2: Better performing PaliGemma release based on Gemma 2.
- SegFormer: Introduced the SegFormer architecture for SemanticSegmentation.
- CLIP.
- EfficientNet: Added EfficientNet presets, including the Edge and lite0 variants.
- RetinaNet: Added an object detection task model.
- Stable Diffusion: Added SD3.5 large and large turbo presets and flash attention support.
- HuggingFace integration.
- All Keras team presets are now on both Kaggle and Huggingface hubs.
Breaking Changes.
- Updated initialization parameters for SD3, replacing height and width with image_shape.
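Callers still passing the old arguments can translate them with a small adapter. The sketch below is illustrative only: `migrate_sd3_kwargs` is a hypothetical helper, not part of KerasHub, and it assumes `image_shape` is a channels-last `(height, width, channels)` tuple with three channels. Check the SD3 backbone docstring before relying on that layout.

```python
def migrate_sd3_kwargs(kwargs, channels=3):
    """Return a copy of `kwargs` with `height`/`width` folded into `image_shape`.

    Hypothetical helper; assumes a channels-last (height, width, channels) layout.
    """
    new_kwargs = dict(kwargs)  # never mutate the caller's dict
    has_height = "height" in new_kwargs
    has_width = "width" in new_kwargs
    if not (has_height or has_width):
        return new_kwargs  # nothing to migrate
    if has_height != has_width:
        raise ValueError("`height` and `width` must be given together.")
    if "image_shape" in new_kwargs:
        raise ValueError("Pass either `image_shape` or `height`/`width`, not both.")
    height = new_kwargs.pop("height")
    width = new_kwargs.pop("width")
    new_kwargs["image_shape"] = (height, width, channels)
    return new_kwargs


old_style = {"height": 512, "width": 768, "dtype": "bfloat16"}
new_style = migrate_sd3_kwargs(old_style)
print(new_style)
```

The helper copies its input, so existing call sites can be migrated one at a time without side effects.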
What's Changed
- version bump to 0.17.0.dev0 by @divyashreepathihalli in #1944
- Update stable_diffusion_3_presets.py path by @divyashreepathihalli in #1946
- [Semantic Segmentation] - Add SegFormer Architecture, Weight Conversion Script and Presets by @DavidLandup0 in #1883
- Update readme by @divyashreepathihalli in #1949
- Update llama_backbone.py docstring by @divyashreepathihalli in #1950
- Update path for Llama by @sachinprasadhs in #1953
- Update SD3 init parameters (replacing height, width with image_shape) by @james77777778 in #1951
- Update docstring by @sachinprasadhs in #1954
- Add CLIP model by @james77777778 in #1955
- Add EfficientNet Presets by @pkgoogle in #1933
- Add SD3.5 large and large turbo presets by @james77777778 in #1960
- Mirror all weights on HF from Kaggle by @divyashreepathihalli in #1959
- [T5 1.1] Enable v1.1 Presets by @DavidLandup0 in #1948
- Update preset path for SD 3.5 and T5 1.1 by @divyashreepathihalli in #1961
- minor fix to HF mirror script by @divyashreepathihalli in #1962
- Add presets for CLIP and fix some minor bugs by @james77777778 in #1964
- sync models and update mirror script to sync model cards on HF and Kaggle by @divyashreepathihalli in #1971
- Bump the python group with 5 updates by @dependabot in #1969
- [MiT and SegFormer] Refactor Backbone Arg Names by @DavidLandup0 in #1958
- Correct model card links for Gemma variants by @RyanMullins in #1972
- [RetinaNet] Image Converter and ObjectDetector by @sineeli in #1906
- Improve future compatibility of CLIPMultiHeadAttention by @james77777778 in #1975
- Fix return_attention_scores bug by @abheesht17 in #1977
- Correct the kaggle handle by @sineeli in #1982
- Add Efficientnet Edge presets by @pkgoogle in #1976
- update docstring examples by @sachinprasadhs in #1970
- [Flux] Port Flux Core Model by @DavidLandup0 in #1864
- Add closest EfficientNet variants by @pkgoogle in #1967
- Sync HF <> Kaggle by @divyashreepathihalli in #1986
- EfficientNet: Add lite0 variant by @pkgoogle in #1968
- Update README.md by @mattdangerw in #1990
- Reduce the metadata we track per preset by @mattdangerw in #1991
- Temp fix for keras-hub testing by @mattdangerw in #1996
- Version bump to 0.18.0.dev0 by @divyashreepathihalli in #2001
- Skip failing JAX test by @divyashreepathihalli in #2000
- Version bump to 0.18.0 and cherry pick by @divyashreepathihalli in #2002
Full Changelog: v0.17.0...v0.18.0
v0.17.0
Summary
- 📢 KerasNLP and KerasCV are now becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
- Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
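Code that has to run against both the old and new package names during a migration can resolve the import at runtime. This is a minimal sketch, not an official migration path: `first_available` is a hypothetical helper, and the import names `keras_hub` and `keras_nlp` are assumed to match the installed packages.

```python
import importlib.util


def first_available(candidates):
    """Return the first importable top-level module name in `candidates`, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer the consolidated package and fall back to the older one.
package_name = first_available(("keras_hub", "keras_nlp"))
if package_name is None:
    print("Neither keras_hub nor keras_nlp is installed.")
else:
    print(f"Would import models from {package_name}")
```

Once a name is resolved, `importlib.import_module(package_name)` returns the module itself, so the rest of the code can use a single alias.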
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and self in calls to super() by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create Backbone base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix value_dim in TransformerDecoder's cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose token_embedding as a Backbone Property by @abheesht17 in #676
- Move from_preset to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add AlbertClassifier by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to GPT2Preprocessor by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_t...
v0.16.0.dev0
Summary
- 📢 KerasNLP and KerasCV are now becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
- Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and self in calls to super() by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create Backbone base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix value_dim in TransformerDecoder's cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose token_embedding as a Backbone Property by @abheesht17 in #676
- Move from_preset to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add AlbertClassifier by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to GPT2Preprocessor by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once en...
v0.15.1
Summary
Bug fix patch release.
- Always run tf preprocessing on CPU.
- Fix running preprocessing outside the main python thread.
- Fix loading classifiers with the "old name" of XXClassifier as XXTextClassifier.
- Restore support for passing bytestrings as strings to tokenizers and other preprocessing layers.
What's Changed
- Version bump for pre-release by @mattdangerw in #1842
- V0.15.1.dev1 by @mattdangerw in #1844
- Version bump for 0.15.1 release by @mattdangerw in #1845
Full Changelog: v0.15.0...v0.15.1
v0.15.0
Summary
📢 KerasNLP is becoming KerasHub 📢, read more about it here.
This release contains a number of feature improvements:
- Added int8 quantization support.
- Use the quantize() method to quantize any model.
- Llama 2 and Llama 3 pre-quantized presets are available.
- PaliGemmaCausalLM will automatically resize input images during preprocessing.
- Added more converters for huggingface/transformers checkpoints.
- Gemma 2, PaliGemma, GPT2, Bert, Albert, DistilBert, Bart.
- Class detection for huggingface/transformers checkpoints.
- Call from_preset() on a base class, and we will find the correct subclass to create.
- Added Vicuna presets.
- Alias Classifier as TextClassifier, BertClassifier as BertTextClassifier.
- Added tokenizer.special_tokens and tokenizer.special_token_ids as convenient properties to view all special tokens on a pretrained tokenizer.
import keras_nlp

# Quantize an unquantized model.
lm = keras_nlp.models.CausalLM.from_preset(
    "gemma2_instruct_2b_en",
    dtype="bfloat16",
)
lm.quantize("int8")

# Load a pre-quantized model.
lm = keras_nlp.models.CausalLM.from_preset(
    "llama3_instruct_8b_en_int8",
    dtype="bfloat16",
)

# Convert a bert model in the huggingface/transformers format.
classifier = keras_nlp.models.TextClassifier.from_preset(
    "hf://google-bert/bert-base-uncased",
    num_classes=2,
)

# View all special tokens.
print(classifier.preprocessor.tokenizer.special_tokens)
print(classifier.preprocessor.tokenizer.special_token_ids)
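The class detection mentioned in the summary can be pictured as a registry lookup: the base class reads which architecture a preset describes and dispatches to the matching subclass. The sketch below is a toy model of that idea and not KerasNLP's implementation. `PRESET_CONFIGS`, `BaseLM`, `BertLM` and `GPT2LM` are hypothetical names, and real presets are resolved from stored config files rather than an in-memory dict.

```python
# Toy preset metadata (hypothetical); real presets keep this in config files.
PRESET_CONFIGS = {
    "toy_bert": {"architecture": "bert", "hidden_dim": 8},
    "toy_gpt2": {"architecture": "gpt2", "hidden_dim": 16},
}


class BaseLM:
    architecture = None  # set by each concrete subclass
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass that declares an architecture name.
        if cls.architecture is not None:
            BaseLM._registry[cls.architecture] = cls

    def __init__(self, hidden_dim):
        self.hidden_dim = hidden_dim

    @classmethod
    def from_preset(cls, preset):
        config = dict(PRESET_CONFIGS[preset])
        target = BaseLM._registry[config.pop("architecture")]
        # When called on a subclass, refuse presets for another architecture.
        if not issubclass(target, cls):
            raise ValueError(
                f"{preset!r} is a {target.__name__}, not a {cls.__name__}"
            )
        return target(**config)


class BertLM(BaseLM):
    architecture = "bert"


class GPT2LM(BaseLM):
    architecture = "gpt2"


# Calling the base class picks the subclass named by the preset.
model = BaseLM.from_preset("toy_gpt2")
print(type(model).__name__, model.hidden_dim)
```

Calling `from_preset` on a concrete subclass still works for matching presets and raises for mismatched ones, which keeps the narrower call sites type-safe.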
Breaking changes
- On all backends, all strings and ragged output will be returned as python strings or python lists respectively.
- This includes preprocessing methods like tokenize() and detokenize().
- This may break code that depended on tf.Tensor output on the tensorflow backend, but will lead to consistent output on all backends, which we believe will be an overall improvement.
- Preprocessing layers can still always be included in a tf.data preprocessing pipeline, on any backend.
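Code that previously decoded byte strings by hand may need to accept both forms while it migrates. The following is a minimal sketch, assuming outputs arrive as `bytes`, `str`, or (possibly nested) lists of them. `to_python_strings` is a hypothetical helper, not a KerasNLP API.

```python
def to_python_strings(value, encoding="utf-8"):
    """Recursively convert bytes (or nested lists/tuples of bytes) to str."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, (list, tuple)):
        return [to_python_strings(item, encoding) for item in value]
    return value  # already a str, or a non-string value such as a token id


old_output = [b"the quick", b"brown fox"]  # bytes, as older code might produce
new_output = ["the quick", "brown fox"]    # python strings, as described above
print(to_python_strings(old_output) == to_python_strings(new_output))
```

Because values that are already strings pass through unchanged, the helper can stay in place after the upgrade and be removed later.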
What's Changed
- Version bump to 0.14.0.dev0 by @grasskin in #1675
- Revert "Version bump to 0.14.0.dev0" by @grasskin in #1676
- Remove Keras pin, fix tests by @mattdangerw in #1681
- Add quantization support for Gemma, Gemma2 and PaliGemma by @james77777778 in #1670
- add vicuna preset by @sineeli in #1672
- Porting Gemma 2 transformers checkpoint by @ariG23498 in #1678
- Improve CI speed and resolve issues of run_quantization_check by @james77777778 in #1682
- Remove build_from_signature from MHA layers by @mattdangerw in #1687
- Refactoring: in CachedMultiHeadAttention call MHA methods instead of recoding the attention calculation by @apehex in #1684
- Porting PaliGemma transformers checkpoint by @ariG23498 in #1686
- Allow importing keras_nlp without tensorflow by @mattdangerw in #1660
- Add flag to gemma conversion script to specify local orbax by @mattdangerw in #1688
- Fix compatibility for earlier versions of Keras by @james77777778 in #1690
- Add a test against keras-nightly by @mattdangerw in #1693
- Fix dtype bugs in ReversibleEmbedding and LayerNorm by @james77777778 in #1692
- Partially revert #1687 by @mattdangerw in #1695
- Fix quantization test for XLNet by @james77777778 in #1699
- Add a HF BERT converter, improve safetensor loading by @mattdangerw in #1694
- Add a subtle fix for gemma 2 conversions by @mattdangerw in #1701
- One more small Gemma conversion fix by @mattdangerw in #1702
- Slightly more defensive handling of type for backbone by @mattdangerw in #1703
- Add support for converting Gemma 2 checkpoints by @mattdangerw in #1700
- Make it clearer what is running in the github action UI by @mattdangerw in #1707
- Try upgrading tensorflow pin by @mattdangerw in #1706
- Bump version to fix query norm in Gemma 2 9b by @mattdangerw in #1709
- Gemma: Add logit soft-capping to score function. by @RyanMullins in #1712
- Version bump HEAD to 0.15 by @mattdangerw in #1713
- Port gpt2 transformers checkpoint by @cosmo3769 in #1704
- Add soft capping to reversible embedding layer by @mattdangerw in #1718
- Add presets for gemma 2 2b by @mattdangerw in #1721
- Utilize to_numpy=True in quantize if available by @james77777778 in #1725
- Dynamic int8 quantization for Llama2 and Llama3 by @james77777778 in #1720
- Bump the python group with 2 updates by @dependabot in #1726
- Shield gemma shortnames by @mattdangerw in #1731
- Sliding window fixes by @mattdangerw in #1738
- Add int8 models to Llama2 and Llama3 by @james77777778 in #1734
- Port distilbert transformer checkpoint by @cosmo3769 in #1736
- Add support of kwargs to Backbone.from_preset and fix the dtype forwarding in Task.from_preset by @james77777778 in #1742
- Remove src init file contents by @mattdangerw in #1743
- Remove ROADMAP.md by @mattdangerw in #1773
- Fix nested list in args on keras.io by @mattdangerw in #1772
- Remove stale tf only examples by @mattdangerw in #1771
- Limit the default sequence length to 1024 for all models by @mattdangerw in #1770
- Consistent preprocessing output on all backends by @mattdangerw in #1777
- Port albert transformer checkpoint by @cosmo3769 in #1767
- Lower the default learning rate for albert by @mattdangerw in #1786
- Port bart transformer checkpoint by @cosmo3769 in #1783
- Add an option to disable default compilation by @mattdangerw in #1787
- Port mistral transformer checkpoint by @cosmo3769 in #1768
- [Bart]Fix missing weight port by @cosmo3769 in #1789
- Remove python 3.8 version in setup.py by @mattdangerw in #1792
- Class detection works for huggingface checkpoints by @mattdangerw in #1800
- Rename KerasNLP symbols for a multi-modal future by @mattdangerw in #1803
- Move preprocessing to base classes by @mattdangerw in #1807
- Add add_bos=False, add_eos=False to SentencePieceTokenizer.__init__() by @briango28 in #1811
- Only load a full task config when load_task_extras is passed by @mattdangerw in #1812
- Add image and audio converter classes by @mattdangerw in #1813
- Simplify registering "built-in" presets by @mattdangerw in #1818
- Support image and audio information in task summaries by @mattdangerw in #1819
- Take two of #1812, simpler classifier head loading by @mattdangerw in #1823
- Remove preprocessing layers we no longer use by @mattdangerw in #1824
- Version bump for dev release by @mattdangerw in #1825
- Version bump for dev release by @mattdangerw in #1830
- Version bump for 0.15.0 release by @mattdangerw in #1832
New Contributors
- @apehex made their first contribution in #1684
- @cosmo3769 made their first contribution in #1704
Full Changelog: v0.14.4...v0.15.0
v0.14.4
Summary
- Fix issues with Gemma 2 sliding window.
- Fix TensorFlow backend Gemma 2 generation.
What's Changed
- Sliding window fixes by @mattdangerw in #1738
- version bump by @mattdangerw in #1740
- version bump by @mattdangerw in #1741
Full Changelog: v0.14.3...v0.14.4
v0.14.3
Summary
- Short names for shield gemma checkpoints.
keras_nlp.models.GemmaCausalLM.from_preset("shieldgemma_2b_en")
What's Changed
- Version bump dev release by @mattdangerw in #1732
- Version bump for release by @mattdangerw in #1733
Full Changelog: v0.14.2...v0.14.3
v0.14.2
Summary
- Add Gemma 2 2b.
- Fixes for logit softcapping.
What's Changed
- Version bump 0.14.2.dev0 by @mattdangerw in #1719
- Bump pypi action version by @mattdangerw in #1722
- version bump by @mattdangerw in #1723
- Version bump 0.14.2 by @mattdangerw in #1724
Full Changelog: v0.14.1...v0.14.2
v0.14.1
Summary
- Update Gemma 2 9b to fix minor config error.
What's Changed
- Bump version to fix query norm in Gemma 2 9b by @mattdangerw in #1709
- Version bump 0.14.1.dev0 by @mattdangerw in #1714
Full Changelog: v0.14.0...v0.14.1