FEAT fix harm_categories for existing datasets #443

Open
romanlutz opened this issue Oct 9, 2024 · 0 comments
Is your feature request related to a problem? Please describe.

We have many fetch methods for datasets in pyrit.datasets; see https://github.com/Azure/PyRIT/blob/main/pyrit/datasets/fetch_example_datasets.py

However, with #396 we're changing what datasets look like. So far, a lot of metadata has lived at the dataset level; it will be moved to the prompt level. This makes sense because every prompt may have different harm categories.
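
To make the shift concrete, here is a minimal sketch of metadata moving from the dataset level to the prompt level. The class and function names (`OldDataset`, `Prompt`, `push_down`) are hypothetical and only illustrate the idea; they are not PyRIT's actual classes or the design from #396.

```python
from dataclasses import dataclass, field


# Hypothetical shapes for illustration only; not PyRIT's actual classes.
@dataclass
class OldDataset:
    # Metadata stored once for the whole dataset.
    harm_categories: list[str]
    prompts: list[str]


@dataclass
class Prompt:
    value: str
    # Metadata stored per prompt, so each prompt can differ.
    harm_categories: list[str] = field(default_factory=list)


def push_down(old: OldDataset) -> list[Prompt]:
    """Copy the dataset-level categories onto every prompt as a starting point."""
    # list(...) gives each prompt its own copy, so later per-prompt edits
    # do not leak into the other prompts.
    return [
        Prompt(value=text, harm_categories=list(old.harm_categories))
        for text in old.prompts
    ]
```

Copying the dataset-level value is only a fallback: every prompt ends up with the same categories, which is exactly the inaccuracy this issue asks to fix.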

While this is not in scope for #396, we want accurate harm_categories to be reflected for each prompt.

Describe the solution you'd like

Go through all the datasets and check whether the source data indicates, per prompt, which harm_categories the prompt belongs to. For example, the original dataset may have a column for that. Note that a prompt can have multiple values (harm_categories is plural, i.e., a list).
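
As a rough sketch of what this could look like for a CSV-based dataset: the column names (`prompt`, `category`), the `;` separator, and the helper names below are assumptions for illustration. Each real dataset will need its own mapping, and the output here is a plain dict rather than any PyRIT type.

```python
import csv
import io


def parse_harm_categories(raw, sep=";"):
    """Split a raw category cell into a clean list; empty or missing cells give []."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(sep) if part.strip()]


def prompts_with_categories(csv_text, prompt_col="prompt", category_col="category"):
    """Read CSV text and attach a harm_categories list to every prompt.

    The column names are hypothetical defaults; pass the names the
    source dataset actually uses.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "value": row[prompt_col],
            # row.get returns None when the column is absent, which maps to [].
            "harm_categories": parse_harm_categories(row.get(category_col), sep=";"),
        }
        for row in reader
    ]


# Made-up sample rows, not taken from any real dataset.
sample = (
    "prompt,category\n"
    "Example prompt one,illegal activity\n"
    "Example prompt two,hate; harassment\n"
    "Example prompt three,\n"
)
rows = prompts_with_categories(sample)
```

Prompts whose source row has no category end up with an empty list, which makes it easy to spot the ones that still need manual review.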

This can be done one dataset at a time. If you volunteer to take on a dataset, please comment below to avoid redundant work.

CC @rdheekonda

@romanlutz added the "not ready yet" label (This issue needs more definition or is blocked by a pending change.) on Oct 9, 2024
@romanlutz self-assigned this on Oct 9, 2024