v2.9.0
Release 2.9
Major Features and Improvements
- New FastBertNormalizer that improves speed for BERT normalization and is convertible to TF Lite.
- New FastBertTokenizer that combines FastBertNormalizer and FastWordpieceTokenizer.
- New ngrams kernel for handling STRING_JOIN reductions.
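To illustrate what a string-join reduction over n-grams computes, here is a minimal pure-Python sketch. It is not the new kernel itself; the function name `string_join_ngrams` and its parameters are hypothetical and only mirror the idea of sliding a fixed-width window over tokens and joining each window with a separator. In the library, the corresponding call is expected to be `text.ngrams(..., reduction_type=text.Reduction.STRING_JOIN)`.

```python
from typing import List


def string_join_ngrams(tokens: List[str], width: int, separator: str = " ") -> List[str]:
    """Return all n-grams of `width` tokens, each joined by `separator`.

    Hypothetical helper for illustration only; it is not part of tf.Text.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Slide a window of `width` tokens across the input. When the input is
    # shorter than `width`, the range is empty and no n-grams are produced.
    return [
        separator.join(tokens[i:i + width])
        for i in range(len(tokens) - width + 1)
    ]


bigrams = string_join_ngrams(["the", "quick", "brown", "fox"], width=2)
print(bigrams)
```

The sketch handles a single flat list of tokens; the real op works on batched and ragged tensors.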
Bug Fixes and Other Changes
- NgramsStringJoin shape inference fixed to handle unranked tensors
- Upgrade pybind11 and reenable tests that were broken.
- Rename a couple files to match the naming of the other tflite kernels. Also adds some deps to tflite_ops that were missing and causing an error when testing :all.
- Add to TF Lite documentation that ngrams is a convertible op.
- Fix public access and missing ICU data for build_fast_bert_normalizer_model and enable the disabled tests.
- Update the doc for FastWordpieceTokenizer.
- Refine the doc for FastWordpieceTokenizer.
- Bug fix: make BertTokenizer work for RaggedTensors with row_splits_dtype=int32
- Fix typo in text.WordpieceTokenizer
- Added missing commas in emoticons for normalizer
- Refactor build and test scripts to use prepare_tf_dep.sh
- Fixes prepare_tf_dep.sh for OSX.
- Fixed bug in setup.py that was requiring the wrong version.
- Updated package with the correct versions of Python we release on.
- Update documentation on TF Lite convertible ops.
- Transition to use TF's version of bazel.
- Transition to use TF's bazel configuration.
- Add missing symbols for tokenization layers
- Fix typo in text_generation.ipynb
- Fix grammar typo
- Allow fast wordpiece tokenizer to take in external wordpiece model.
- Internal change
- Improvement to guide where mean call is redundant. See #810 for more info.
- Update broken link and fix typo in BERT-SNGP demo notebook
- Consolidate disparate test-related files into a single testing_infra folder.
- Pin tf-text version to guides & tutorials.
- Fix bug in constrained sequence op. Added a check for the edge case num_steps = 0, which should do nothing, to prevent SIGSEGV crashes.
- Remove outdated Keras tests, since Keras no longer makes the testing utilities available.
- Update bert preprocessing by padding correct tensors
- Update tensorflow-text notebooks from 2.7 to 2.8
- Optimize FastWordPiece to only generate requested outputs.
- Add a note about byte-indexing vs character indexing.
- Add a MAX_TOKENS to the transformer tutorial.
- Only export tensorflow symbols from shared libs.
- (Generated change) Update tf.Text versions and/or docs.
- Do not run the prepare_tf_dep script for Apple M1 macs.
- Update text_classification_rnn.ipynb
- Fix the exported symbols for the linker test. Adding them to the shared objects instead of the C++ code allows the code to be compiled together in one large shared lib.
- Implement FastBertNormalizer based on codepoint-wise mappings.
- Add pybind for fast_bert_normalizer_model_builder.
- Remove unused comments related to Python 2 compatibility.
- Update transformer.ipynb
- Update toolchain & temporarily disable tf lite tests.
- Define manylinux2014 for the new toolchain target, and have presubmits use it.
- Move tflite build deps to custom target.
- Add FastBertTokenizer.
- Update bazel version to 5.1.0
- Update TF Text to use new Ngrams kernel.
- Don't try to set dimension if shape is unknown for ngrams.
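The note about byte indexing versus character indexing listed above matters whenever offsets are used to slice text. As a hedged illustration, assuming the offsets in question are positions in the UTF-8 encoded string, the following standard-library sketch shows how the two kinds of index diverge once a string contains non-ASCII characters. The helper names are hypothetical and are not part of tf.Text.

```python
def char_to_byte_offset(s: str, char_index: int) -> int:
    """Return the UTF-8 byte offset at which character `char_index` starts."""
    return len(s[:char_index].encode("utf-8"))


def byte_to_char_offset(s: str, byte_index: int) -> int:
    """Return the character index for a UTF-8 byte offset.

    `byte_index` must fall on a character boundary; an offset in the middle
    of a multi-byte character raises UnicodeDecodeError.
    """
    return len(s.encode("utf-8")[:byte_index].decode("utf-8"))


# "naive cafe" with two accented letters, written with escapes so the
# code points are unambiguous (U+00EF and U+00E9, two bytes each in UTF-8).
sample = "na\u00efve caf\u00e9"

print(len(sample))                     # number of characters
print(len(sample.encode("utf-8")))     # number of bytes
print(char_to_byte_offset(sample, 6))  # byte offset of the "c" in "cafe"
```

Slicing a Python `str` with a byte offset, or a byte string with a character offset, silently selects the wrong span, so it helps to convert explicitly at the boundary between the two.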
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Aflah, Connor Brinton, devnev39, Janak Ramakrishnan, Martin, Nathan Luehr, Pierre Dulac, Rabin Adhikari, gadagashwini, mohantym, rtg0795