harness: Detector only #833

Open
wants to merge 23 commits into
base: main
from

Conversation

vidushiMaheshwari

@vidushiMaheshwari vidushiMaheshwari commented Aug 15, 2024

Sometimes I want to run a different detector on the same probe, and sometimes my detector fails and I do not want to run the probe again. For those cases I created a detector-only harness that takes in the report: it reads a report.jsonl file and runs the specified detector over it.

This could also help in offline testing of models against prod data.
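
As a rough illustration of the idea described above, here is a minimal sketch of re-scoring stored outputs without calling a generator. The `entry_type` and `outputs` field names and the `keyword_detector` function are assumptions made for illustration; they are not taken from this PR's code.

```python
import json


def load_attempts(lines):
    """Parse JSONL lines and keep only the entries recorded as attempts."""
    entries = (json.loads(line) for line in lines if line.strip())
    return [e for e in entries if e.get("entry_type") == "attempt"]


def keyword_detector(output, keywords=("sorry", "cannot")):
    """Hypothetical stand-in detector: 0.0 if a refusal keyword appears, else 1.0."""
    text = output.lower()
    return 0.0 if any(k in text for k in keywords) else 1.0


def rescore(lines, detector):
    """Apply `detector` to every stored output; no probe or generator is run."""
    return [
        [detector(out) for out in attempt.get("outputs", [])]
        for attempt in load_attempts(lines)
    ]


# Two in-memory lines standing in for a report.jsonl file.
report_lines = [
    '{"entry_type": "start_run setup"}',
    '{"entry_type": "attempt", "outputs": ["Sorry, I cannot help.", "Sure, here it is."]}',
]
scores = rescore(report_lines, keyword_detector)
```

Because the outputs are read back from the report, a different detector can be swapped in by passing another callable to `rescore`.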

Signed-off-by: Vidushi Maheshwari <[email protected]>
Contributor

github-actions bot commented Aug 15, 2024

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

@vidushiMaheshwari
Author

I have read the DCO Document and I hereby sign the DCO

@vidushiMaheshwari
Author

recheck

github-actions bot added a commit that referenced this pull request Aug 15, 2024
@vidushiMaheshwari vidushiMaheshwari changed the title Detector only run Detector only Harness Aug 15, 2024
@vidushiMaheshwari vidushiMaheshwari marked this pull request as ready for review August 15, 2024 02:44
@leondz
Collaborator

leondz commented Aug 15, 2024

Thanks, will take a look!

Collaborator

@jmartin-tech jmartin-tech left a comment

Some preliminary thoughts based on the assumption a configurable harness is viable here.

Still thinking on the use case here for user experience, I have some reservations about exposing a configurable harness or if there is some more user friendly way to elevate this to a continue or rescore action that does not require the user to think about the harness, but also is more flexible in context.

The current flag --detector_only is a bit specific to be a top level config option.

garak/cli.py Outdated
Comment on lines 529 to 531
if not _config.plugins.detector_spec:
logging.error("Detector(s) not specified. Use --detectors")
raise ValueError("use --detectors to specify some detectors")
Collaborator

By default the detectors to use should probably be extracted from the start_run setup entry in the provided report file with the command line option being an override to allow reprocessing results against a different detector.
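
A minimal sketch of that precedence, assuming the report's setup entry has `entry_type` equal to `"start_run setup"` and stores the detector spec under a `plugins.detector_spec` key. Both the key name and the helper names here are assumptions for illustration, not confirmed garak API.

```python
import json


def detectors_from_report(lines):
    """Return the detector spec recorded in the report's setup entry, if any."""
    for line in lines:
        entry = json.loads(line)
        if entry.get("entry_type") == "start_run setup":
            return entry.get("plugins.detector_spec")  # assumed key name
    return None


def resolve_detector_spec(cli_spec, lines):
    """Prefer an explicit command-line spec; otherwise fall back to the report."""
    if cli_spec:
        return cli_spec
    spec = detectors_from_report(lines)
    if not spec:
        raise ValueError("no detectors given and none recorded in the report")
    return spec


report = ['{"entry_type": "start_run setup", "plugins.detector_spec": "mitigation"}']
from_report = resolve_detector_spec(None, report)
overridden = resolve_detector_spec("misleading", report)
```

With this shape, omitting `--detectors` re-runs what the original run used, while passing it reprocesses the same results against a different detector.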

garak/command.py Outdated Show resolved Hide resolved
garak/cli.py Outdated
Comment on lines 256 to 260
parser.add_argument(
"--detector_only",
action="store_true",
help="run detector on jsonl report"
)
Collaborator

I wonder if this might shift to a harness type options to mimic generator_options and probe_options?

--harness_options for inline json
--harness_options_file that could take a json config file

Some validation may be needed on the object received to ensure the options provided are for a valid harness type and meet the requirements for launching the harness.

This would then remove the need to also add --probed_report_path as that is currently only used when this option is set and json or file config aligns with other plugins.

{ 
  "DetectorOnly":
  {
    "report_path": "file.report.jsonl"
  }
}
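
The validation mentioned above could look roughly like the following sketch. `REQUIRED_OPTIONS` and `validate_harness_options` are hypothetical names introduced only for this example; how garak would actually register harness requirements is not specified in this thread.

```python
import json

# Hypothetical registry mapping each harness name to the option keys it requires.
REQUIRED_OPTIONS = {"DetectorOnly": {"report_path"}}


def validate_harness_options(raw_json: str) -> dict:
    """Parse inline JSON and check it names a known harness with its required keys."""
    options = json.loads(raw_json)
    if not isinstance(options, dict):
        raise ValueError("harness options must be a JSON object")
    for name, opts in options.items():
        if name not in REQUIRED_OPTIONS:
            raise ValueError(f"unknown harness: {name}")
        if not isinstance(opts, dict):
            raise ValueError(f"options for {name} must be a JSON object")
        missing = REQUIRED_OPTIONS[name] - set(opts)
        if missing:
            raise ValueError(f"{name} is missing options: {sorted(missing)}")
    return options


validated = validate_harness_options(
    '{"DetectorOnly": {"report_path": "file.report.jsonl"}}'
)
```

Failing early here means a missing `report_path` is reported before any run setup happens.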

Author

I'm not exactly sure if continue or rescore has been implemented yet (or maybe in some other branch?). But I agree with creating harness_options instead of exposing a lot of unnecessary higher-level options. I have incorporated the idea of harness_options in the new changes.

@vidushiMaheshwari vidushiMaheshwari marked this pull request as draft August 19, 2024 21:30
@vidushiMaheshwari vidushiMaheshwari marked this pull request as ready for review August 20, 2024 14:27
@leondz
Collaborator

leondz commented Aug 21, 2024

Still thinking on the use case here for user experience, I have some reservations about exposing a configurable harness or if there is some more user friendly way to elevate this to a continue or rescore action that does not require the user to think about the harness, but also is more flexible in context.

I'm not exactly sure if continue or rescore has been implemented yet (or maybe in some other branch?).

We don't have continue / rescore anywhere yet. I think implementing rescore as a separate harness, behind the scenes, could make a ton of sense. I think rescore/continue functionality makes sense to surface as a CLI option at some point - it seems like more intuitive ux than something like "--harnesses Rescore" or giving a custom config file.

garak/cli.py Show resolved Hide resolved
garak/cli.py Show resolved Hide resolved
@classmethod
def from_dict(cls, dicti):
"""Initializes an attempt object from dictionary"""
attempt_obj = cls()
Collaborator

Does this skip the attempt constructor? Can we add an explicit type signature to signal what cls is expected to be?

Collaborator

@jmartin-tech jmartin-tech Aug 21, 2024

cls is the callable for the class which will be an Attempt. This will call the __init__() method with all defaults.

Due to the current overrides in the class, attempt_obj.outputs below may not produce the same in-memory object for a multi-turn conversation attempt, since the existing as_dict() method serializes outputs into the log and not the full message history.

For the purposes of this PR I suspect this is acceptable, however it is worth noting.
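
To make the constructor behaviour concrete, here is a small sketch using a simplified stand-in class. It is not garak's `Attempt`; the two fields are chosen only for illustration, and the return annotation shows one way to signal what `cls` is expected to be.

```python
class Attempt:
    """Simplified stand-in for illustration; not garak's Attempt class."""

    def __init__(self, prompt=None, outputs=None):
        self.prompt = prompt
        self.outputs = list(outputs) if outputs is not None else []

    def as_dict(self) -> dict:
        return {"prompt": self.prompt, "outputs": self.outputs}

    @classmethod
    def from_dict(cls, data: dict) -> "Attempt":
        obj = cls()  # runs __init__ with all defaults, then fields are filled in
        obj.prompt = data.get("prompt")
        obj.outputs = list(data.get("outputs", []))
        return obj


original = Attempt("hello", ["hi"])
restored = Attempt.from_dict(original.as_dict())
```

The round trip can only restore what `as_dict()` wrote out, which is why any state that is not serialized (such as a full message history) would be lost.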

@@ -105,6 +105,24 @@ def as_dict(self) -> dict:
"messages": self.messages,
}

@classmethod
def from_dict(cls, dicti):
Collaborator

-- out of scope for here, but we should implement serialization/deserialization for Attempts

garak/cli.py Outdated Show resolved Hide resolved
garak/cli.py Outdated
Comment on lines 521 to 524
if parsed_specs["detector"] == []:
_config.plugins.harnesses["Probewise"] = {}
else:
_config.plugins.harnesses["Pxd"] = {}
Collaborator

can you just run me through the reasoning here? would this clobber harness config loaded from global / site / cli-specific config YAML?

Collaborator

@jmartin-tech jmartin-tech Aug 21, 2024

I suspect this could be avoided. We should not clobber config without an explicit override from a command line flag.

When _config.plugins.harnesses contains no configuration for a specific harness, whether any detectors were provided in parsed_specs is the determining factor for which default harness to load. This does expose that a top-level parameter for selecting a specific harness may be missing, which would matter if the default config were to provide options for several harness types. Currently, using the presence of DetectorOnly config as the selection criterion seems a bit brittle.

If there is a desire to consolidate harness selection, maybe something like:

    harness_command = command.pxd_run
    if not _config.plugins.harnesses:
        if parsed_specs["detector"] == []:
            harness_command = command.probewise_run
    elif "detectoronly" in _config.plugins.harnesses:
        harness_command = command.detector_only_run
    match harness_command:
        case command.detector_only_run:
            harness_command()
        case command.probewise_run:
            harness_command(
                generator, parsed_specs["probe"], evaluator, parsed_specs["buff"]
            )
        case command.pxd_run:
            harness_command(
                    generator,
                    parsed_specs["probe"],
                    parsed_specs["detector"],
                    evaluator,
                    parsed_specs["buff"],
            )
        case _: # base case for invalid callable, currently not a reachable case
            logging.warning("no valid harness selected")

    command.end_run()

There is probably more abstraction possible here but this offers an idea.
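
One possible further step, sketched under assumptions: the selection rule from the snippet above can be pulled into a pure function that reads the harness config without writing to it, which also addresses the clobbering concern. `select_harness` and the string labels it returns are hypothetical and not part of garak.

```python
def select_harness(harness_config: dict, detector_names: list) -> str:
    """Pick a harness label from config and detectors without mutating either."""
    if not harness_config:
        # No harness configured: detectors decide between the two defaults.
        return "pxd" if detector_names else "probewise"
    if "detectoronly" in harness_config:
        return "detectoronly"
    # Some other harness config is present: keep the pxd default.
    return "pxd"


choice_default = select_harness({}, [])
choice_detectors = select_harness({}, ["mitigation"])
choice_rescore = select_harness({"detectoronly": {"DetectorOnly": {}}}, [])
```

The caller would then map the returned label to the matching `command` function, so the decision can be unit tested separately from running anything.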

garak/command.py Outdated
@@ -255,3 +255,48 @@ def write_report_digest(report_filename, digest_filename):
digest = report_digest.compile_digest(report_filename)
with open(digest_filename, "w", encoding="utf-8") as f:
f.write(digest)

def detector_only_run():
Collaborator

@jmartin-tech What do you think this is telling us about where responsibility for orchestrating runs lies? Is the existing Harness interface just too inflexible to allow invoking novel things like DetectorOnly from garak.cli?

docs/source/harnesses.rst Show resolved Hide resolved
print(msg)
raise ValueError(msg)

super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.
Collaborator

can this work?

Suggested change
- super().run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.
+ self.run_detectors(detectors, attempts, evaluator) # The probe is None, but hopefully no errors occur with probe.
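
For context on the question above, this sketch shows the general Python behaviour with stand-in classes (none of these are garak's actual classes): `super().method()` and `self.method()` resolve to the same function unless the calling class or a subclass overrides the method.

```python
class Harness:
    def run_detectors(self):
        return "base"


class DetectorOnly(Harness):
    def run_via_super(self):
        # Always resolves to the next class after DetectorOnly in the MRO.
        return super().run_detectors()

    def run_via_self(self):
        # Resolves on the instance's type, so subclass overrides are honoured.
        return self.run_detectors()


class CustomDetectorOnly(DetectorOnly):
    def run_detectors(self):
        return "override"


plain = DetectorOnly()
custom = CustomDetectorOnly()
results = (
    plain.run_via_super(),
    plain.run_via_self(),
    custom.run_via_super(),
    custom.run_via_self(),
)
```

So, assuming the harness does not define its own `run_detectors`, the two spellings behave the same, and `self.` is the form that stays correct if a subclass later overrides it.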

garak/harnesses/base.py Outdated Show resolved Hide resolved
Collaborator

@leondz leondz left a comment

thanks for this. some auxiliary comments & qs - let's still wait for @jmartin-tech 's review

@leondz
Collaborator

leondz commented Aug 21, 2024

resolves #142

@leondz leondz linked an issue Aug 21, 2024 that may be closed by this pull request
garak/cli.py Outdated
logging.error("report path not specified")
raise ValueError("Specify jsonl report path using report_path")

command.start_run()
Collaborator

If the refactor for harness selection offered is not used, this needs to be removed as start_run() was called before entering this conditional.

Suggested change
- command.start_run()

Collaborator

@jmartin-tech jmartin-tech left a comment

This is great progress. I am thinking this pattern can be a stepping stone to providing a rescore or continue feature in a separate iteration.

There are still a few quirks that likely need to be addressed. I added details for a couple comments about how command line options need to be incorporated, however there is a somewhat more fundamental issue to address as this harness should be possible to run without a -m/--model_type or -n/--model_name specified. Instantiating the generator is overkill for this harness and would limit usability significantly.

I would like to see support for usage like:
h_config.json

{
  "detectoronly":
  {
    "DetectorOnly":
    {
      "report_path": "<file_path>"
    }
  }
}
python -m garak --harness_option_file h_config.json
python -m garak -d misleading --harness_option_file h_config.json

or

yaml config such as:
h_config.yaml

plugins:
  detector_spec: misleading,mitigation
  harnesses:
    detectoronly:
      DetectorOnly:
        report_path: <report_file_path>
python -m garak --config h_config.yaml

Sorry for the churn here, trying to balance quick iteration, ease of use, and roadmap needs as we incorporate the use case.

garak/command.py Outdated
with open(config["report_path"]) as f:
data = [json.loads(line) for line in f]

## Get detectors and evaluator from report if not specified by the user
Collaborator

This is not quite what I was thinking in terms of obtaining the detectors from the original log. The actual extraction looks good, as the start_run entry will contain the expanded detector list; however, it ignores existing top-level arguments and the spec parsing support for options set on the harness.

The harness could accept the list of detectors provided via the parsed_spec for detectors from the command line and an evaluator as other harnesses do, if no detectors were provided then the list of detectors can be obtained based on config from the original report.

If I am reading this correctly, this expects detectors and eval_threshold to be set in the harness config and falls back if they are not found. This would not account for the top-level command line options that, as a user, I would expect to be applied.

It looks like the current expectation would be a config like:

h_options.json:

{
  "detectoronly": {
    "Detectoronly": {
      "report_path": "<file_path>",
      "detectors" : [ "d1", "d2" ],
      "eval_threshold": 0.9
    }
  }
}

With a command line like:

python -m garak --harness_option_file h_options.json -m nim -n meta/llama3-70b-instruct

However based on the existing options a user may have expectations for -d all to apply all detectors when passed as an option.

h_options.json:

{
  "detectoronly": {
    "Detectoronly": {
      "report_path": "<file_path>"
    }
  }
}

With a command line like:

python -m garak --harness_option_file h_options.json -m nim -n meta/llama3-70b-instruct -d all --eval_threshold 0.5

My thought here is that the garak.command module should not need access to data from the cli, but should be provided information that takes advantage of cli having parsed all the setup options and its support for things like expanding detector classes based on a module name.

Author

Okay, so I will add the config and then support only --detectors / -d as a top-level argument, and if that is not present, fall back to the ones present within the report. It makes more sense from a user perspective 👍

Collaborator

@vidushiMaheshwari, circling back to check on progress.

I am happy to monitor this PR or offer parts of what I suggested as a PR to your branch in the coming weeks.

Author

I apologize for being inactive. I just pushed changes that I believe should address the comments. I would appreciate a PR with the suggested changes!

garak/cli.py Outdated Show resolved Hide resolved
vidushiMaheshwari and others added 6 commits August 22, 2024 19:41
Co-authored-by: Leon Derczynski <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Leon Derczynski <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Leon Derczynski <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
Co-authored-by: Jeffrey Martin <[email protected]>
Signed-off-by: Vidushi Maheshwari <[email protected]>
@leondz leondz added the architecture Architectural upgrades label Sep 18, 2024
@leondz leondz changed the title Detector only Harness harness: Detector only Sep 24, 2024
Labels
architecture Architectural upgrades
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add ability to reevaluate runs
3 participants