Merge branch 'main' into main

Signed-off-by: Leon Derczynski <[email protected]>
NVIDIA · Oct 11, 2024 · 4d2ba73
2 parents 71c5838 + 1ba46e9
commit 4d2ba73

Showing 109 changed files with 1,667 additions and 386 deletions.
diff --git a/.github/workflows/cla.yml b/.github/workflows/cla.yml
@@ -14,6 +14,7 @@ permissions:
 
 jobs:
   CLAAssistant:
+    if: github.repository_owner == 'leondz'
     runs-on: ubuntu-latest
     steps:
       - name: "CA & DCO Assistant"

diff --git a/.github/workflows/labels.yml b/.github/workflows/labels.yml
@@ -26,6 +26,7 @@ on:
 
 jobs:
   handle-labels:
+    if: github.repository_owner == 'leondz'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/github-script@v7

diff --git a/.github/workflows/maintain_cache.yml b/.github/workflows/maintain_cache.yml
@@ -19,6 +19,7 @@ permissions:
 
 jobs:
   build:
+    if: github.repository_owner == 'leondz'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v3

diff --git a/docs/source/configurable.rst b/docs/source/configurable.rst
@@ -129,18 +129,8 @@ For an example of how to use the ``detectors``, ``generators``, ``buffs``,
 * ``show_100_pass_modules`` - Should entries scoring 100% still be detailed in the HTML report?
 
 
-Using a custom JSON config
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Some plugins can take a JSON config specified on the command line. This config 
-has the same structure as a YAML config, starting with the plugin model/type.
-The config can either be written to a file and the path passed, with 
-`--generator_option_file` or `--probe_option_file`, or directly as JSON on the
-command prompt, with `--generator_options` or `--probe_options`. An example 
-is given in `RestGenerator Config with JSON <rest_generator_with_json>`_ below.
-
-Examples: quick configs
-^^^^^^^^^^^^^^^^^^^^^^^
+Bundled quick configs
+^^^^^^^^^^^^^^^^^^^^^
 
 Garak comes bundled with some quick configs that can be loaded directly using ``--config``.
 These don't need the ``.yml`` extension when being requested. They include:
@@ -174,8 +164,21 @@ probes and run each prompt just once:
 
 If we save this as ``latent1.yaml`` somewhere, then we can use it with ``garak --config latent1.yaml``.
 
-Plugins
--------
+
+
+Using a custom JSON config
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some plugins can take a JSON config specified on the command line. This config 
+has the same structure as a YAML config, starting with the plugin model/type.
+The config can either be written to a file and the path passed, with 
+`--generator_option_file` or `--probe_option_file`, or directly as JSON on the
+command prompt, with `--generator_options` or `--probe_options`. An example 
+is given in `RestGenerator Config with JSON <rest_generator_with_json>`_ below.
+
+
+Configuring Plugins
+-------------------
 
 Garak's functionality is provided through its plugins. Most parts of garak are plugins,
 like the ``probes`` and ``detectors`` that do the actual examination of the target,
@@ -250,8 +253,8 @@ is an example that is equivalent to the configuration above:
             openai:
                 temperature: 1.0
 
-RestGenerator
-^^^^^^^^^^^^^
+Example: RestGenerator
+^^^^^^^^^^^^^^^^^^^^^^
 
 RestGenerator is a slightly complex generator, though mostly because it exposes
 so many config values, allowing flexible integrations. This example sets 
@@ -317,4 +320,26 @@ This defines a REST endpoint where:
 
 
 This should be written to a file, and the file's path passed on the command 
-line with `-G`. 
+line with `-G`. 
+
+Configuration in code
+---------------------
+
+The preferred way to instantiate a plugin is using ``garak._plugins.load_plugin()``.
+This function takes two parameters:
+
+* ``name``, the plugin's package, module, and class - e.g. ``generators.test.Lipsum``
+* (optional) ``config_root``, either ``garak._config`` or a config dictionary, beginning at a top-level plugin type.
+
+``load_plugin()`` returns a configured instance of the requested plugin.
+
+OpenAIGenerator config with dictionary
+""""""""""""""""""""""""""""""""""""""
+
+.. code-block:: python
+
+    >>> import garak._plugins
+    >>> c = {"generators":{"openai":{"OpenAIGenerator":{"seed":30,"name":"gpt-4"}}}}
+    >>> garak._plugins.load_plugin("generators.openai.OpenAIGenerator", config_root=c)
+    🦜 loading generator: OpenAI: gpt-4
+    <garak.generators.openai.OpenAIGenerator object at 0x71bc97693d70>
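
The nesting of the dictionary passed as ``config_root`` follows the dotted plugin name: plugin type, then module, then class. The sketch below illustrates only that correspondence. ``lookup_plugin_config`` is a hypothetical helper written for this note and is not garak's own resolution logic.

```python
def lookup_plugin_config(config_root: dict, plugin_name: str) -> dict:
    """Walk a nested config dict along a dotted plugin name.

    Hypothetical helper for illustration; returns {} when nothing is configured.
    """
    node = config_root
    for part in plugin_name.split("."):
        if not isinstance(node, dict) or part not in node:
            return {}
        node = node[part]
    return node if isinstance(node, dict) else {}


# Same dictionary shape as the OpenAIGenerator example in the docs diff above.
c = {"generators": {"openai": {"OpenAIGenerator": {"seed": 30, "name": "gpt-4"}}}}
settings = lookup_plugin_config(c, "generators.openai.OpenAIGenerator")
missing = lookup_plugin_config(c, "generators.test.Lipsum")
print(settings)
print(missing)
```

A name with no matching entry yields an empty dictionary here; how garak itself treats unconfigured plugins is not shown in this diff.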
diff --git a/docs/source/garak.generators.guardrails.rst b/docs/source/garak.generators.guardrails.rst
@@ -1,6 +1,24 @@
 garak.generators.guardrails
 ===========================
 
+This is a generator for wrapping a NeMo Guardrails configuration. Using this
+garak generator enables security testing of a Guardrails config.
+
+The ``guardrails`` generator expects a path to a valid Guardrails configuration
+to be passed as its name. For example,
+
+.. code-block::
+
+   garak -m guardrails -n sample_abc/config
+
+This generator requires installation of the `nemoguardrails <https://pypi.org/project/nemoguardrails/>`_
+Python package.
+
+When invoked, garak sends prompts in series to the Guardrails setup using 
+``rails.generate``, and waits for a response. The generator does not support
+parallelisation, so it's recommended to run smaller probes, or set ``generations``
+to a low value, in order to reduce garak run time.
+
 .. automodule:: garak.generators.guardrails
    :members:
    :undoc-members:

diff --git a/docs/source/garak.generators.nemo.rst b/docs/source/garak.generators.nemo.rst
@@ -1,6 +1,26 @@
 garak.generators.nemo
 =====================
 
+Wrapper for `nemollm <https://pypi.org/project/nemollm/>`_.
+
+Expects the NGC API key in the environment variable ``NGC_API_KEY`` and the 
+organisation ID in the environment variable ``ORG_ID``.
+
+Configurable values:
+
+* temperature: 0.9
+* top_p: 1.0
+* top_k: 2
+* repetition_penalty: 1.1 - between 1 and 2 incl., or none
+* beam_search_diversity_rate: 0.0
+* beam_width: 1
+* length_penalty: 1
+* guardrail: None -  (present in API but not implemented in library)
+* api_uri: "https://api.llm.ngc.nvidia.com/v1" - endpoint URI
+
+
+
+
 .. automodule:: garak.generators.nemo
    :members:
    :undoc-members:

diff --git a/docs/source/garak.generators.nvcf.rst b/docs/source/garak.generators.nvcf.rst
@@ -1,6 +1,98 @@
 garak.generators.nvcf
 =====================
 
+This garak generator is a connector to NVIDIA Cloud Functions. It permits fast
+and flexible generation.
+
+NVCF functions work by sending a request to an invocation endpoint, and then polling
+a status endpoint until the response is received. The cloud function is described
+using a UUID, which is passed to garak as the ``model_name``. The API key should be placed in the
+environment variable ``NVCF_API_KEY`` or set in a garak config. For example:
+
+.. code-block::
+
+   export NVCF_API_KEY="example-api-key-xyz"
+   garak -m nvcf -n 341da0d0-aa68-4c4f-89b5-fc39286de6a1
+
+
+Configuration
+-------------
+
+Configurable values:
+
+* temperature - Temperature for generation. Passed as a value to the endpoint.
+* top_p - Cumulative probability threshold for nucleus sampling. Passed as a value to the endpoint.
+* invoke_uri_base - Base URL for the NVCF endpoint (default is for NVIDIA-hosted functions).
+* status_uri_base - URL to check for request status updates (default is for NVIDIA-hosted functions).
+* timeout - Read timeout for HTTP requests (note, this is network timeout, distinct from inference timeout)
+* version_id - API version id, appended to endpoint URLs if supplied
+* stop_on_404 - Give up on endpoints returning 404 (i.e. nonexistent ones)
+* extra_params - Dictionary of optional extra values to pass to the endpoint. Default ``{"stream": False}``.
+
+Some NVCF instances require custom parameters, for example a "model" header. These
+can be asserted in the NVCF config. For example, this cURL maps to the following
+garak YAML:
+
+
+.. code-block::
+
+   curl -s -X POST 'https://api.nvcf.nvidia.com/v2/nvcf/pexec/functions/341da0d0-aa68-4c4f-89b5-fc39286de6a1' \
+   -H 'Content-Type: application/json' \
+   -H 'Authorization: Bearer example-api-key-xyz' \
+   -d '{
+         "messages": [{"role": "user", "content": "How many letters are in the word strawberry?"}],
+         "model": "prefix/obsidianorder/terer-nor",
+         "max_tokens": 1024,
+         "stream": false
+      }'
+
+.. code-block:: yaml
+
+   ---
+   plugins:
+      generators:
+         nvcf:
+            NvcfChat:
+               api_key: example-api-key-xyz
+               max_tokens: 1024
+               extra_params:
+                  stream: false
+                  model: prefix/obsidianorder/terer-nor
+      model_type: nvcf.NvcfChat
+      model_name: 341da0d0-aa68-4c4f-89b5-fc39286de6a1
+
+The ``nvcf`` generator uses the standard garak generator mechanism for 
+``max_tokens``, which is why this value is set at generator-level rather than 
+as a key-value pair in ``extra_params``.
+
+
+Scaling
+-------
+
+The NVCF generator supports parallelisation and it's recommended to use this,
+invoking garak with ``--parallel_attempts`` set to a value higher than one.
+If the NVCF function times out due to insufficient capacity, garak will note this, 
+back off, and retry the request later.
+
+.. code-block::
+
+   garak -m nvcf -n 341da0d0-aa68-4c4f-89b5-fc39286de6a1 --parallel_attempts 32
+
+
+Or, as a YAML config:
+
+.. code-block:: yaml
+
+   ---
+   system:
+      parallel_attempts: 32
+   plugins:
+      model_type: nvcf.NvcfChat
+      model_name: 341da0d0-aa68-4c4f-89b5-fc39286de6a1
+
+
+
+
 .. automodule:: garak.generators.nvcf
    :members:
    :undoc-members:

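The NVCF docs above describe an invoke-then-poll pattern: a request is sent to an invocation endpoint, then a status endpoint is polled until the response arrives. A minimal sketch of the polling half follows, with no network access. ``poll_until_done`` and the simulated status callable are hypothetical names for illustration and do not reflect the generator's actual code.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check_status: Callable[[], Optional[str]],
    max_polls: int = 5,
    delay: float = 0.0,
) -> Optional[str]:
    """Call check_status until it returns a response or max_polls is reached.

    check_status returns None while the request is pending (hypothetical contract).
    """
    for _ in range(max_polls):
        response = check_status()
        if response is not None:
            return response
        time.sleep(delay)  # wait before asking the status endpoint again
    return None  # gave up: still pending after max_polls checks


# Simulated status endpoint: pending twice, then a completed response.
pending_then_done = iter([None, None, "done"])
result = poll_until_done(lambda: next(pending_then_done))
print(result)
```

Bounding the number of polls keeps a stuck request from blocking the run indefinitely. A real client would likely also use a non-zero delay between checks.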
diff --git a/docs/source/garak.generators.ollama.rst b/docs/source/garak.generators.ollama.rst
@@ -0,0 +1,8 @@
+garak.generators.ollama
+========================
+
+.. automodule:: garak.generators.ollama
+   :members:
+   :undoc-members:
+   :show-inheritance:   
+
diff --git a/docs/source/generators.rst b/docs/source/generators.rst
@@ -20,6 +20,7 @@ For a detailed oversight into how a generator operates, see :ref:`garak.generato
    garak.generators.langchain_serve
    garak.generators.litellm
    garak.generators.octo
+   garak.generators.ollama
    garak.generators.openai
    garak.generators.nemo
    garak.generators.nim

diff --git a/garak/analyze/calibration.py b/garak/analyze/calibration.py
@@ -10,7 +10,7 @@
 from typing import Union
 
 
-from garak import _config
+from garak.data import path as data_path
 
 MINIMUM_STD_DEV = (
     0.01732  # stddev=0 gives unusable z-scores; give it an arbitrary floor of 3^.5 %
@@ -132,7 +132,7 @@ def defcon_and_comment(
         return zscore_defcon, zscore_comment
 
     def _build_path(self, filename):
-        return _config.transient.package_dir / "resources" / "calibration" / filename
+        return data_path / "calibration" / filename
 
     def __init__(self, calibration_path: Union[None, str, pathlib.Path] = None) -> None:
 

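The ``calibration.py`` hunk above replaces a path built from ``_config.transient.package_dir`` with one built from ``garak.data.path``. The sketch below shows the same ``/`` joining using a stand-in path object. The base directory and filename are hypothetical, and the only assumption about ``garak.data.path`` is that it supports the ``/`` operator, as the diff shows.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for garak.data.path (assumed only to support "/").
data_path = PurePosixPath("/opt/garak/data")


def build_path(filename: str) -> PurePosixPath:
    """Mirror of the new _build_path body: data_path / "calibration" / filename."""
    return data_path / "calibration" / filename


p = build_path("example.json")  # hypothetical filename
print(p)
```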
diff --git a/garak/analyze/misp.py b/garak/analyze/misp.py
@@ -9,12 +9,9 @@
 import os
 
 from garak import _plugins
-import garak._config
+from garak.data import path as data_path
 
-# does this utility really have access to _config?
-misp_resource_file = (
-    garak._config.transient.package_dir / "resources" / "misp_descriptions.tsv"
-)
+misp_resource_file = data_path / "misp_descriptions.tsv"
 misp_descriptions = {}
 if os.path.isfile(misp_resource_file):
     with open(misp_resource_file, "r", encoding="utf-8") as f:

diff --git a/garak/analyze/report_digest.py b/garak/analyze/report_digest.py
@@ -7,15 +7,18 @@
 import json
 import markdown
 import os
+import pprint
 import re
 import sys
 
 import jinja2
 import sqlite3
 
 from garak import _config
+from garak.data import path as data_path
 import garak.analyze.calibration
 
+
 if not _config.loaded:
     _config.load_config()
 
@@ -33,9 +36,7 @@
 about_z_template = templateEnv.get_template("digest_about_z.jinja")
 
 
-misp_resource_file = (
-    _config.transient.package_dir / "resources" / "misp_descriptions.tsv"
-)
+misp_resource_file = data_path / "misp_descriptions.tsv"
 misp_descriptions = {}
 if os.path.isfile(misp_resource_file):
     with open(misp_resource_file, "r", encoding="utf-8") as f:
@@ -63,6 +64,7 @@ def plugin_docstring_to_description(docstring):
 
 def compile_digest(report_path, taxonomy=_config.reporting.taxonomy):
     evals = []
+    payloads = []
     setup = defaultdict(str)
     with open(report_path, "r", encoding="utf-8") as reportfile:
         for line in reportfile:
@@ -75,6 +77,12 @@ def compile_digest(report_path, taxonomy=_config.reporting.taxonomy):
                 run_uuid = record["run"]
             elif record["entry_type"] == "start_run setup":
                 setup = record
+            elif record["entry_type"] == "payload_init":
+                payloads.append(
+                    record["payload_name"]
+                    + "  "
+                    + pprint.pformat(record, sort_dicts=True, width=60)
+                )
 
     calibration = garak.analyze.calibration.Calibration()
     calibration_used = False
@@ -85,10 +93,11 @@ def compile_digest(report_path, taxonomy=_config.reporting.taxonomy):
             "garak_version": garak_version,
             "start_time": start_time,
             "run_uuid": run_uuid,
-            "setup": repr(setup),
+            "setup": pprint.pformat(setup, sort_dicts=True, width=60),
             "probespec": setup["plugins.probe_spec"],
             "model_type": setup["plugins.model_type"],
             "model_name": setup["plugins.model_name"],
+            "payloads": payloads,
         }
     )
 

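The ``report_digest.py`` hunk above swaps ``repr(setup)`` for ``pprint.pformat(setup, sort_dicts=True, width=60)``. The sketch below compares the two on a small record. The record's values are hypothetical; only the key names are taken from the diff.

```python
import pprint

# Hypothetical setup record; key names mirror those read in compile_digest.
setup = {
    "plugins.probe_spec": "all",
    "plugins.model_type": "test.Blank",
    "plugins.model_name": "example-model",
    "entry_type": "start_run setup",
}

flat = repr(setup)  # one long line, keys in insertion order
pretty = pprint.pformat(setup, sort_dicts=True, width=60)  # wrapped, keys sorted
print(pretty)
```

Because the record is wider than 60 characters, ``pformat`` puts one key per line and sorts the keys. That makes the setup block in the report easier to read and stable across runs.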
diff --git a/garak/analyze/templates/digest_about_z.jinja b/garak/analyze/templates/digest_about_z.jinja
@@ -6,7 +6,7 @@
 <p>About Z-scores in this analysis:</p>
 <ul>
 <li>Positive Z-scores mean better than average, negative Z-scores mean worse than average.</li>
-<li>"Average" is determined over a bag of models of varying sizes, updated periodically. <a href="https://github.com/leondz/garak/blob/main/garak/resources/calibration/bag.md">Details</a></li>
+<li>"Average" is determined over a bag of models of varying sizes, updated periodically. <a href="https://github.com/leondz/garak/blob/main/garak/data/calibration/bag.md">Details</a></li>
 <li>For any probe, roughly two-thirds of models get a Z-score between -1.0 and +1.0.</li>
 <li>The middle 10% of models score -0.125 to +0.125. This is labelled "competitive".</li>
 <li>A Z-score of +1.0 means the score was one standard deviation better than the mean score other models achieved for this probe &amp; metric</li>