Restrict transformers version until MPS issue is addressed #1039

Merged

Conversation

jmartin-tech
Collaborator

As of transformers 4.47.0, the `device` specified for some detectors that utilize Hugging Face models is not applied in some macOS contexts.

Specifically, when `cpu` is specified, `mps` code paths are activated and raise exceptions, because `mps` was not configured or initialized.
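
A minimal sketch of the failure mode, assuming transformers 4.47.0 on Apple Silicon hardware (the model name is illustrative, not necessarily the detector model garak loads):

    from transformers import pipeline

    # Request CPU placement explicitly. On an affected setup, pipeline
    # construction still routes the model to the "mps" device and can raise
    # the MPS out-of-memory RuntimeError captured in the log below.
    clf = pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",  # illustrative
        device="cpu",
    )
    print(clf.model.device)  # expected: cpu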

Verification


  • Automation test complete

Example failure seen with transformers == 4.47.0:

% python -m pytest tests/test_attempt.py
============================================================================================= test session starts =============================================================================================
platform darwin -- Python 3.12.4, pytest-8.3.4, pluggy-1.5.0
rootdir: /Users/vagrant/Projects/nvidia/garak
configfile: pyproject.toml
plugins: cov-6.0.0, respx-0.21.1, pytest_httpserver-1.1.0, anyio-4.7.0, mock-3.14.0, requests-mock-1.12.1
collected 12 items

tests/test_attempt.py F...........                                                                                                                                                                      [100%]

================================================================================================== FAILURES ===================================================================================================
_________________________________________________________________________________________ test_attempt_sticky_params __________________________________________________________________________________________

capsys = <_pytest.capture.CaptureFixture object at 0x1030c6ba0>

    def test_attempt_sticky_params(capsys):

        cli.main(
            f"-m test.Blank -g 1 -p atkgen,dan.Dan_6_0 --report_prefix {PREFIX}".split()
        )
        report_path = _config.transient.data_dir / _config.reporting.report_dir
        reportlines = (
            open(report_path / f"{PREFIX}.report.jsonl", "r", encoding="utf-8")
            .read()
            .split("\n")
        )
        # Note: the line numbers below are based on respecting the `-g 1` options passed
>       complete_atkgen = json.loads(reportlines[3])  # status 2 for the one atkgen attempt
E       IndexError: list index out of range

tests/test_attempt.py:27: IndexError
-------------------------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------------------------
garak LLM vulnerability scanner v0.10.0.post1 ( https://github.com/NVIDIA/garak ) at 2024-12-09T07:33:48.160729
📜 logging to /Users/vagrant/.local/share/garak/garak.log
🦜 loading generator: Test: Blank
📜 reporting to /Users/vagrant/.local/share/garak/garak_runs/_garak_test_attempt_sticky_params.report.jsonl
🕵️  queue of probes: atkgen.Tox, dan.Dan_6_0
 detector load failed: toxicity.ToxicCommentModel, skipping >>
No detectors, nothing to do
---------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------
WARNING  root:_plugins.py:419 Exception instantiating garak.detectors.toxicity.ToxicCommentModel: MPS backend out of memory (MPS allocated: 1.01 GB, other allocations: 16.00 KB, max allowed: 4.53 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
Traceback (most recent call last):
  File "/Users/vagrant/Projects/nvidia/garak/garak/_plugins.py", line 416, in load_plugin
    plugin_instance = klass(config_root=config_root)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/garak/detectors/base.py", line 122, in __init__
    self.detector = TextClassificationPipeline(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/pipelines/text_classification.py", line 85, in __init__
    super().__init__(**kwargs)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/pipelines/base.py", line 926, in __init__
    self.model.to(self.device)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3164, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
           ^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 1.01 GB, other allocations: 16.00 KB, max allowed: 4.53 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
ERROR    root:probewise.py:27  detector load failed: toxicity.ToxicCommentModel, skipping >>
WARNING  root:base.py:92 No detectors, nothing to do
ERROR    root:cli.py:620 No detectors, nothing to do
Traceback (most recent call last):
  File "/Users/vagrant/Projects/nvidia/garak/garak/cli.py", line 594, in main
    command.probewise_run(
  File "/Users/vagrant/Projects/nvidia/garak/garak/command.py", line 237, in probewise_run
    probewise_h.run(generator, probe_names, evaluator, buffs)
  File "/Users/vagrant/Projects/nvidia/garak/garak/harnesses/probewise.py", line 107, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/Users/vagrant/Projects/nvidia/garak/garak/harnesses/base.py", line 95, in run
    raise ValueError(msg)
ValueError: No detectors, nothing to do
=========================================================================================== short test summary info ===========================================================================================
FAILED tests/test_attempt.py::test_attempt_sticky_params - IndexError: list index out of range
======================================================================================== 1 failed, 11 passed in 2.74s =========================================================================================
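
The mitigation taken here is to cap the transformers dependency until the upstream device-placement issue is addressed, rather than to patch the detector code. The exact bounds are in the diff and not reproduced here; based on the failing version above, the constraint takes roughly this shape:

    transformers<4.47.0

The error text also suggests PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 as an environment-level escape hatch, but that only lifts the MPS memory cap; it would not restore the requested `cpu` placement, so pinning the dependency is the safer interim fix.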

As of transformers 4.47.0, the `device` specified for some detectors
that utilize Hugging Face models is not applied in some macOS contexts.

Specifically, when `cpu` is specified, `mps` code paths are activated
and raise exceptions, because `mps` was not configured or initialized.

Signed-off-by: Jeffrey Martin <[email protected]>
@leondz
Collaborator

leondz commented Dec 9, 2024

Lgtm, thanks

@jmartin-tech jmartin-tech merged commit 0b837c1 into NVIDIA:main Dec 9, 2024
9 checks passed
@jmartin-tech jmartin-tech deleted the fix/restrict-transformers-version branch December 9, 2024 21:03
@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2024