This repository hosts code and datasets relating to Responsible NLP projects from Meta AI.

Projects

AdvPromptSet:
- AdvPromptSet: a comprehensive and challenging adversarial text prompt set with 197,628 prompts of varying toxicity levels and more than 24 sensitive demographic identity groups and combinations.
fairscore:
- From Rebecca Qian, Candace Ross, Jude Fernandes, Eric Smith, Douwe Kiela, Adina Williams. Perturbation Augmentation for Fairer NLP. 2022.
- PANDA, an annotated dataset of 100K demographic perturbations of diverse text, rewritten to change gender, race/ethnicity and age references.
- The perturber, pretrained models, code and other artifacts related to the Perturbation Augmentation for Fairer NLP project will be released shortly.
gender_gap_pipeline:
- The Gender-GAP Pipeline, from Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Michael Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R Costa-jussà
holistic_bias:
- From Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams. "I'm sorry to hear that": finding bias in language models with a holistic descriptor dataset. 2022.
- Code to generate a dataset, HolisticBias, consisting of nearly 600 demographic terms in over 450k sentence prompts
- Code to calculate Likelihood Bias, a metric of the amount of bias in a language model, defined on HolisticBias demographic terms
robbie:
- ROBBIE: we test 6 bias/toxicity metrics (including 2 novel ones) across 5 model families and 3 bias/toxicity mitigation techniques, and show that using a broad array of metrics enables much better assessment of safety issues in these models and mitigations.
SMART-Filtering:
- From Vipul Gupta, Candace Ross, David Pantoja, Rebecca J. Passonneau, Megan Ung, Adina Williams. Improving Model Evaluation using SMART Filtering of Benchmark Datasets. 2024.
- SMART Filtering: a new approach for selecting a high-quality subset of examples from existing benchmark datasets. The methodology applies three filtering steps: 1) removing easy examples, 2) removing data-contaminated examples that are highly likely to have leaked into training datasets, and 3) removing similar examples.
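
The three SMART Filtering steps listed above can be pictured with a small toy sketch. This is not the code in the SMART-Filtering folder. The `Example` fields, the "every model answered correctly" test for easy examples, the precomputed contamination flag, and the token-overlap (Jaccard) similarity with a 0.8 threshold are all assumptions made for illustration; the project may define each criterion differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    text: str
    model_correct: List[bool]  # hypothetical: did each reference model answer correctly?
    contaminated: bool         # hypothetical: flag from some contamination check


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed stand-in similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def smart_filter_sketch(examples: List[Example], sim_threshold: float = 0.8) -> List[Example]:
    # Step 1: remove "easy" examples (assumed: every reference model got them right).
    kept = [e for e in examples if not (e.model_correct and all(e.model_correct))]
    # Step 2: remove examples flagged as likely data-contaminated.
    kept = [e for e in kept if not e.contaminated]
    # Step 3: remove similar examples, greedily keeping the first of each near-duplicate group.
    result: List[Example] = []
    for e in kept:
        if all(jaccard(e.text, r.text) < sim_threshold for r in result):
            result.append(e)
    return result
```

Calling `smart_filter_sketch(examples)` on a list of `Example` objects returns the surviving subset in its original order. The greedy pass in step 3 is order-dependent, so a real pipeline would probably choose which near-duplicate to keep more deliberately.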

See CONTRIBUTING.md for how to help out, and see LICENSE for license information.

facebookresearch/ResponsibleNLP
