Release 2.9

Major Features and Improvements

New FastBertNormalizer that improves speed for BERT normalization and is convertible to TF Lite.
New FastBertTokenizer that combines FastBertNormalizer and FastWordpieceTokenizer.
New ngrams kernel for handling STRING_JOIN reductions.
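
To make the STRING_JOIN reduction concrete, here is a minimal pure-Python sketch of the behavior: slide a fixed-width window over a token list and join each window into one string. It is not the new kernel. The helper name `ngrams_string_join` and its parameters are hypothetical and chosen only for this example.

```python
from typing import List


def ngrams_string_join(tokens: List[str], width: int, separator: str = " ") -> List[str]:
    """Hypothetical helper: join each window of `width` tokens into one string."""
    if width <= 0:
        raise ValueError("width must be positive")
    # When there are fewer tokens than `width`, the range is empty and
    # no n-grams are produced.
    return [
        separator.join(tokens[i:i + width])
        for i in range(len(tokens) - width + 1)
    ]


bigrams = ngrams_string_join(["a", "b", "c", "d"], width=2)
print(bigrams)
```

In the library itself, this reduction is requested through `text.ngrams` with `reduction_type=text.Reduction.STRING_JOIN`. The sketch above only mirrors the semantics on a flat Python list.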

Bug Fixes and Other Changes

NgramsStringJoin shape inference fixed to handle unranked tensors
Upgrade pybind11 and reenable tests that were broken.
Rename a couple of files to match the naming of the other tflite kernels. Also add some deps to tflite_ops that were missing and caused an error when testing :all.
Add to TF Lite documentation that ngrams is a convertible op.
Fix public access and missing ICU data for build_fast_bert_normalizer_model and enable the disabled tests.
Update the doc for FastWordpieceTokenizer.
Refine the doc for FastWordpieceTokenizer.
Bug fix: make BertTokenizer work for RaggedTensors with row_splits_dtype=int32
Fix typo in text.WordpieceTokenizer
Add missing commas in the emoticons list for the normalizer
Refactor build and test scripts to use prepare_tf_dep.sh
Fixes prepare_tf_dep.sh for OSX.
Fixed bug in setup.py that was requiring the wrong version.
Updated package with the correct versions of Python we release on.
Update documentation on TF Lite convertible ops.
Transition to use TF's version of bazel.
Transition to use TF's bazel configuration.
Add missing symbols for tokenization layers
Fix typo in text_generation.ipynb
Fix grammar typo
Allow fast wordpiece tokenizer to take in external wordpiece model.
Internal change
Improvement to guide where mean call is redundant. See #810 for more info.
Update broken link and fix typo in BERT-SNGP demo notebook
Consolidate disparate test-related files into a single testing_infra folder.
Pin tf-text version in guides & tutorials.
Fix bug in constrained sequence op. Add a check for the edge case num_steps = 0, which should do nothing, to prevent SIGSEGV crashes.
Remove outdated Keras tests, since Keras no longer makes the testing utilities available.
Update BERT preprocessing to pad the correct tensors
Update tensorflow-text notebooks from 2.7 to 2.8
Optimize FastWordPiece to only generate requested outputs.
Add a note about byte-indexing vs character indexing.
Add a MAX_TOKENS to the transformer tutorial.
Only export tensorflow symbols from shared libs.
(Generated change) Update tf.Text versions and/or docs.
Do not run the prepare_tf_dep script for Apple M1 macs.
Update text_classification_rnn.ipynb
Fix the exported symbols for the linker test. Adding them to the shared objects instead of the C++ code allows the code to be compiled together into one large shared lib.
Implement FastBertNormalizer based on codepoint-wise mappings.
Add pybind for fast_bert_normalizer_model_builder.
Remove unused comments related to Python 2 compatibility.
Update transformer.ipynb
Update toolchain & temporarily disable tf lite tests.
Define manylinux2014 for the new toolchain target, and have presubmits use it.
Move tflite build deps to custom target.
Add FastBertTokenizer.
Update bazel version to 5.1.0
Update TF Text to use new Ngrams kernel.
Don't try to set dimension if shape is unknown for ngrams.
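
The byte-indexing versus character-indexing item above is easy to trip over, so here is a short, hedged illustration in plain Python. It assumes the distinction concerns offsets into UTF-8 encoded strings. The helper `char_to_byte_offset` is hypothetical and is not part of the library.

```python
def char_to_byte_offset(text: str, char_index: int) -> int:
    """Hypothetical helper: byte offset in the UTF-8 encoding of a character index."""
    return len(text[:char_index].encode("utf-8"))


# "é" (U+00E9) and "ö" (U+00F6) each occupy two bytes in UTF-8.
s = "h\u00e9llo w\u00f6rld"
char_start = s.index("w\u00f6rld")                # position counted in characters
byte_start = char_to_byte_offset(s, char_start)   # position counted in bytes

print(char_start, byte_start)
# Slicing the encoded bytes at the byte offset recovers the same word.
print(s.encode("utf-8")[byte_start:].decode("utf-8"))
```

The two offsets agree only for pure-ASCII text. Once a multi-byte character precedes a position, its byte offset is larger than its character offset.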

Thanks to our Contributors

This release contains contributions from many people at Google, as well as:

Aflah, Connor Brinton, devnev39, Janak Ramakrishnan, Martin, Nathan Luehr, Pierre Dulac, Rabin Adhikari, gadagashwini, mohantym, rtg0795
