Releases · NVIDIA-Merlin/Transformers4Rec
v23.12.00
What's Changed
- Use `copy-pr-bot` by @ajschmidt8 in #742
- add rapids infra by @jperez999 in #753
- Fix transformer and error on example when CI uses single-GPU by @gabrielspmoreira in #757
- Fix version for gdown by @EmmaQiaoCh in #767
New Contributors
- @ajschmidt8 made their first contribution in #742
- @EmmaQiaoCh made their first contribution in #767
Full Changelog: v23.08.00...v23.12.00
v23.08.00: adding unit test for end-to-end example (#669)
* adding unit test for multi-gpu example
* added test for notebook 03
* fixed formatting
* update
* update
* Update 01-ETL-with-NVTabular.ipynb: day of week is between 0 and 6; it must be scaled with a max value of 6 to produce correct values in the 0-1 range. If we do col+1 and scale with 7, then a section of the 0-2pi range (for sine purposes) will not be represented.
* Update 01-ETL-with-NVTabular.ipynb: reversed the previous edit for weekday scaling. It is correct that it should be scaled between 0 and 7, because day 0 (unused/non-applicable after the +1 is added) overlaps with day 7 for sine purposes. Monday should scale to 1/7 and Sunday to 7/7 to achieve an even distribution of days along the sine curve.
* reduce num_rows
* Update test_end_to_end_session_based.py
* Update 01-ETL-with-NVTabular.ipynb
* updated test script and notebook
* updated file
* removed nb3 test due to multi-gpu freezing issue
* revised notebooks, added back nb3 test
* fixed test file with black
* update test py
* update test py
* Use `python -m torch.distributed.run` instead of `torchrun`: the `torchrun` script installed on the system is a Python script whose shebang line starts with `#!/usr/bin/python3`, which picks up the wrong version of Python when running in a virtualenv such as our tox test environment. If the shebang were `#!/usr/bin/env python3`, calling `torchrun` from a tox environment would work; until either the pytorch package is updated or we update our CI image, running the Python command directly is more reliable.

Co-authored-by: rnyak <[email protected]>
Co-authored-by: edknv <[email protected]>
Co-authored-by: Oliver Holworthy <[email protected]>
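For context on the weekday-scaling discussion above, here is a minimal, illustrative sketch of the cyclical encoding being described. It is not the notebook's code; the column name `weekday` and the use of pandas/numpy are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Illustrative sketch of the weekday scaling discussed above (not the notebook's code).
# Day-of-week comes in as 0-6; shifting by +1 and dividing by 7 maps Monday to 1/7 and
# Sunday to 7/7, spreading the days evenly over one full sine period (0 to 2*pi), with
# the unused 0/7 position overlapping 7/7.
df = pd.DataFrame({"weekday": np.arange(7)})
scaled = (df["weekday"] + 1) / 7.0
df["weekday_sin"] = np.sin(2 * np.pi * scaled)
df["weekday_cos"] = np.cos(2 * np.pi * scaled)
print(df)
```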
v23.06.00
Update merlin dependency versions to match 23.06 release (#724)
v23.05.00
What’s Changed
🐜 Bug Fixes
- Fixing the projection layer when using weight tying and dim from Transformer output and item embedding differs @gabrielspmoreira (#689)
🚀 Features
📄 Documentation
- update end-to-end example to use systems api @radekosmulski (#680)
🔧 Maintenance
- Add Transformers Torch Extras to install requirements @oliverholworthy (#699)
- Add Conda Package Publish Workflow @oliverholworthy (#688)
- Add workflows to check base branch and set stable branch @oliverholworthy (#694)
- Rename integration test job from `gpu-ci` to `gpu-ci-integration` @oliverholworthy (#687)
- Remove padding from NVTabular getting started example @oliverholworthy (#677)
- Update tag pattern in GitHub Workflows @oliverholworthy (#695)
- don't re-run github actions when PRs get closed @nv-alaiacano (#691)
- remove blossom github action @nv-alaiacano (#681)
- Add topk arg to return topk items and scores at inference step @rnyak (#678)
- Use `torch.testing.assert_close` to check model outputs @oliverholworthy (#686)
v23.04.00
What’s Changed
🐜 Bug Fixes
- Update multi-gpu notebook to set cupy device @edknv (#675)
- Fix bug in get_output_sizes_from_schema with core-schema @marcromeyn (#663)
- Remove torch.squeeze() step from the model's forward method. @sararb (#659)
- Set device in dataloaders @edknv (#654)
- Fix the predictions returned by `Trainer.predict(..)` @sararb (#641)
🚀 Features
- Implemented sampled softmax for NextItemPredictionTask @gabrielspmoreira (#671)
- Add support for ragged inputs to model @oliverholworthy (#666)
- Allow Schema class from core to be used to create Trainer @marcromeyn (#642)
- Cleanup shapes in model.input_schema and output_schema @rnyak (#628)
- Allow Schema class from core to be used to create TabularSequenceFeatures @marcromeyn (#638)
- Allow Schema class from core to be used to create input-blocks @marcromeyn (#634)
📄 Documentation
- extend getting started serving example to serve NVT and TF4Rec model together @rnyak (#670)
- Cropped the table and addressed comments from previous PR @nzarif (#510)
- Update example notebooks to create schema object from merlin core @rnyak (#650)
- replace NVTabular dataloader with Merlin dataloader @rnyak (#644)
🔧 Maintenance
- Update multi-gpu notebook to set cupy device @edknv (#675)
- add concurrency setting to stop tests when new commits get pushed to a PR @nv-alaiacano (#673)
- Switch to using 2 GPU action runners for multi-GPU testing @karlhigley (#665)
- Add workflow to check if base branch of pull request is development @oliverholworthy (#656)
- fix the ci script for new unit tests setup @jperez999 (#658)
- Add unit test for serving torchscript model example notebook @rnyak (#657)
- Separate notebook tests into their own tox environment @nv-alaiacano (#653)
- Update usage of `use_amp` to `use_cuda_amp` for transformers>=4.20 @oliverholworthy (#627)
- Update example notebooks to create schema object from merlin core @rnyak (#650)
- Update padding of ragged features to enable dataloader change @oliverholworthy (#647)
- fix model output_schema dims for BC/Regression task case @rnyak (#646)
- fix test_remove_consecutive_interactions unit test @rnyak (#643)
- replace NVTabular dataloader with Merlin dataloader @rnyak (#644)
- Cleanup shapes in model.input_schema and output_schema @rnyak (#628)
- Migrate schema Tags to merlin.schema.Tags @nv-alaiacano (#632)
- Clean up imports in tests @marcromeyn (#626)
v23.02.00
What's Changed
🐜 Bug Fixes
- Adjust serving notebook to account for underlying shape changes @karlhigley (#631)
🚀 Features
- Add docstrings and the `row_groups_per_part` parameter to the MerlinDataLoader class @sararb (#590)
- Simplify getting-started ETL and fix serving with torch script notebook @rnyak (#604)
📄 Documentation
- Update README link - End-to-end pipeline with NVIDIA Merlin @masoncusack (#593)
- Fix multi-gpu documentation @bbozkaya (#591)
- Transformers4Rec Docs Restructure @lgardenhire (#428)
- add BC task script in readme @rnyak (#600)
🔧 Maintenance
- fix assert error in the test_soft_embedding unit test @rnyak (#595)
- Small fixes in getting-started ETL and training notebooks and fix tuple error in serving notebook @rnyak (#586)
- Fetch release branches so that we can figure out the release branch @oliverholworthy (#609)
- Add Jenkinsfile @AyodeAwe (#537)
- Change data_loader_engine to 'merlin' in examples @edknv (#580)
- adding workflow for gpu ci on gha runner @jperez999 (#585)
New Contributors
- @lgardenhire made their first contribution in #428
- @masoncusack made their first contribution in #593
- @AyodeAwe made their first contribution in #537
Full Changelog: v0.1.16...v23.02.00
v0.1.16
Highlights
1. Standardize the ModelOutput API:
   - Remove ambiguous flags `ignore_masking` and `hf_format`: #543
   - Introduce the `testing` flag to differentiate between evaluation (=True) and inference (=False) modes: #543
   - All prediction tasks return the same output: #546
     - During training and evaluation: the output is a dictionary with three elements: `{"loss": torch.tensor, "labels": torch.tensor, "predictions": torch.tensor}`
     - During inference: the output is the tensor of predictions.
2. Extend the `Trainer` class to support all prediction tasks:
   - The trainer class now accepts a T4Rec model defined with binary or regression tasks.
   - Remove the `HFWrapper` class, as the `Trainer` now supports the base T4Rec `Model` class.
   - Set the default of the trainer's argument `predict_top_k` to `0` instead of `10`.
     - Note that getting the top-k predictions is specific to `NextItemPredictionTask` and the user should explicitly set the parameter in the `T4RecTrainingArguments` object (see the sketch after these highlights). If not specified, the method `Trainer.predict()` returns unsorted predictions for the whole item catalog.
   - Support multi-task learning in the `Trainer` class: it accepts any T4Rec model defined with multiple tasks and/or multiple heads.
3. Fix the inference performance of the Transformer-based model trained with masked language modeling (MLM):
   - At inference, the input sequence is extended by a [MASK] embedding after the last non-padded position to take the target position into account. The hidden representation of the [MASK] position is used to get the next-item prediction scores.
   - With this fix, the user doesn't need to add a dummy position to the input test data when calling `Trainer.predict()` or `model(test_batch, training=False, testing=False)`.
4. Update Transformers4Rec to use the new merlin-dataloader package: #547
   - The NVTabularDataLoader is renamed to MerlinDataLoader to use the loader from the merlin-dataloader package.
   - Users can specify the argument `data_loader_engine='merlin'` in the `T4RecTrainingArguments` object to use the merlin dataloader; it supports GPU and CPU environments. The alias `nvtabular` is also kept to ensure backward compatibility (a usage sketch follows these highlights).
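As a rough, illustrative sketch of the arguments called out in highlights 2 and 4 above: the output directory, batch sizes, and top-k value below are assumptions made for the example, not values from this release; check the Transformers4Rec documentation for the full argument list.

```python
from transformers4rec.config.trainer import T4RecTrainingArguments

# Illustrative values only; output_dir, batch sizes, and top-k are made up for this sketch.
training_args = T4RecTrainingArguments(
    output_dir="./t4rec_checkpoints",   # hypothetical path
    data_loader_engine="merlin",        # merlin-dataloader engine; "nvtabular" remains as an alias
    predict_top_k=20,                   # default is now 0; set explicitly for NextItemPredictionTask
    per_device_train_batch_size=128,
    per_device_eval_batch_size=32,
)

# These arguments are then passed to the T4Rec Trainer, e.g.
# Trainer(model=model, args=training_args, schema=schema).
# During training/evaluation the model output is a dict with "loss", "labels", and
# "predictions"; at inference the output is just the predictions tensor.
```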
What’s Changed
⚠ Breaking Changes
- Extend trainer class to support all T4Rec prediction tasks @sararb (#564)
- Standardize prediction tasks' outputs @nzarif (#546)
- Uses merlin-dataloader package @edknv (#547)
- Refactoring part1- flags modification @nzarif (#543)
🐜 Bug Fixes
- Fix error raised by latest Torchmetrics (0.11.0) @sararb (#576)
- Fix the test data path in Trainer.predict() @sararb (#571)
- Fix discrepancy between evaluation and inference modes @sararb (#551)
🚀 Features
- Support for pre-trained embeddings initializer (trainable or not) @gabrielspmoreira (#572)
- Extend trainer class to support all T4Rec prediction tasks @sararb (#564)
- Standardize prediction tasks' outputs @nzarif (#546)
- Add music-streaming synthetic data to test the support of all predictions tasks with the Trainer class @sararb (#540)
- Refactoring part1- flags modification @nzarif (#543)
📄 Documentation
- Address review feedback @mikemckiernan (#562)
- Serving tfrec with pyt backend example @rnyak (#552)
- docs: Add basic SEO configuration @mikemckiernan (#518)
- docs: Add semver to calver banner @mikemckiernan (#520)
- Minor updates to notebook texts @bbozkaya (#548)
🔧 Maintenance
- Update mypy version from 0.971 to 0.991 @oliverholworthy (#574)
- Uses merlin-dataloader package @edknv (#547)
- fix drafter and update cpu ci to run on targeted branch @jperez999 (#549)
- Add lint workflow to run pre-commit on all files @oliverholworthy (#545)
- Specify packages to look for in setup.py to avoid publishing tests @oliverholworthy (#529)
- Cleanup tensorflow dependencies @oliverholworthy (#530)
- Add docs requirements to extras list in setup.py (#533)
- Remove stale documentation reviews (#531)
- Update branch name extraction for tag builds (#608)
- run github action tests and lint via tox, with upstream deps installed (#527)
v0.1.15
What’s Changed
🐜 Bug Fixes
- Fix failing ci error related to `sparse_names` containing features that are not part of the model's schema @sararb (#541)
- Fix dtype mismatch in CLM masking class due to new data loader changes @sararb (#539)
- Fix CI test based on the requirements of the new merlin loader @sararb (#536)
- quick fix: apply masking when training next item prediction @nzarif (#514)
🚀 Features
📄 Documentation
🔧 Maintenance
- Fix failing ci error related to `sparse_names` containing features that are not part of the model's schema @sararb (#541)
- Fix CI test based on the requirements of the new merlin loader @sararb (#536)
- Specify output dtype for Normalize op in ETL example to match model expectations @oliverholworthy (#523)
- Fix name and bug in MeanReciprocalRankAt @rnyak (#522)
- Update mypy version to match version in pre-commit-config @oliverholworthy (#517)
v0.1.14: Multi-GPU training with DP and DDP documentation (#503)
v0.1.13
What’s Changed
🐜 Bug Fixes
🔧 Maintenance
- Adjust the device used in synthetic data generation @karlhigley (#486)