LLM Multiagent Debate WebUI

This is a Web UI implementation of the paper "Improving Factuality and Reasoning in Language Models through Multiagent Debate", designed to make the method easier to use and understand.

I aimed to keep the original code consistent with the content of the paper. Therefore, I only updated the OpenAI API code and added bridge functions for communication with the Gradio UI.

# Set the "OPENAI_API_KEY" environment variable

python main.py

# Running on local URL:  http://127.0.0.1:7860
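If the UI starts but requests fail, a missing key is a likely cause. The following is a minimal sketch of a pre-flight check, assuming only that the key is read from the `OPENAI_API_KEY` environment variable as described above; the helper name `has_api_key` is hypothetical and not part of this repository.

```python
import os
from typing import Mapping


def has_api_key(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when OPENAI_API_KEY is present and not blank."""
    # Hypothetical helper: the repository may validate the key differently.
    return bool(env.get("OPENAI_API_KEY", "").strip())
```

Calling `has_api_key()` before `python main.py` gives an early, explicit failure instead of an error from the first API request.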

WebUI

Main

You can choose which LLM to use

Simple Math Demo

Grade School Math Demo

Massive Multitask Language Understanding Demo

Biography Demo

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Project Page | Paper

Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch

This is a preliminary implementation of the paper "Improving Factuality and Reasoning in Language Models through Multiagent Debate". More tasks and settings will be released soon. You may see some additional debate logs here.

Also, check out gauss5930's awesome implementation of multiagent debate on open-source LLMs here!

Running experiments

The code for running the arithmetic, GSM, biography, and MMLU tasks can be found in the following subfolders:

./math/ contains code for running the arithmetic task
./gsm/ contains code for running GSM
./biography/ contains code for running biographies
./mmlu/ contains code for running MMLU

Math:

To generate and evaluate answers for Math problems through multiagent debate, cd into the math directory and run: python gen_math.py
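For orientation, here is a rough sketch of the debate procedure described in the paper: each agent answers independently, then revises over several rounds after seeing the other agents' latest answers, and the final answer is taken by majority vote. This is not the code in gen_math.py. The names `debate`, `majority_vote`, `make_stub_agent`, and the `Agent` signature are hypothetical, and the LLM call is replaced by a stub so the example runs offline.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical agent signature: (question, other agents' latest answers) -> answer.
Agent = Callable[[str, List[str]], str]


def majority_vote(answers: List[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def debate(question: str, agents: List[Agent], rounds: int = 2) -> str:
    """Run a simplified multiagent debate and return the majority answer."""
    # Round 0: every agent answers independently.
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        # Each agent sees the other agents' previous answers and may revise.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    return majority_vote(answers)


def make_stub_agent(first_guess: str) -> Agent:
    """Stand-in for an LLM call: adopts the majority view of its peers."""
    def agent(question: str, others: List[str]) -> str:
        return majority_vote(others) if others else first_guess
    return agent


if __name__ == "__main__":
    agents = [make_stub_agent(g) for g in ("14", "14", "15")]
    print(debate("What is 2 + 3 * 4?", agents))
```

In a real run, each stub would be replaced by a function that sends the question and the other agents' responses to the chosen model and returns its reply.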

Grade School Math:

To generate answers for Grade School Math problems through multiagent debate, cd into the gsm directory and run: python gen_gsm.py

To evaluate the generated results of Grade School Math problems: python eval_gsm.py
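Evaluation of this kind usually comes down to parsing a final numeric answer from each agent's response and comparing the majority answer with the ground truth. The sketch below illustrates that idea only; it is not the code in eval_gsm.py. It assumes that each response reports its final answer as `\boxed{...}`, and the helper names `extract_answer` and `is_correct` are hypothetical.

```python
import re
from collections import Counter
from typing import List, Optional

# Assumption: each response reports its final answer as \boxed{<number>}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_answer(response: str) -> Optional[float]:
    """Return the last boxed number in a response, or None if none parses."""
    for content in reversed(BOXED.findall(response)):
        # Drop thousands separators and currency signs before parsing.
        cleaned = content.replace(",", "").replace("$", "").strip()
        try:
            return float(cleaned)
        except ValueError:
            continue
    return None


def is_correct(responses: List[str], ground_truth: float) -> bool:
    """Majority-vote the parsed answers and compare with the ground truth."""
    parsed = [a for a in map(extract_answer, responses) if a is not None]
    if not parsed:
        return False
    majority = Counter(parsed).most_common(1)[0][0]
    return abs(majority - ground_truth) < 1e-6
```

Responses with no parsable answer are ignored in the vote, so a problem counts as wrong only when the remaining agents disagree with the ground truth or no agent produced a usable answer.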

You can download the GSM dataset here

Biography:

To generate answers for Biography problems through multiagent debate, cd into the biography directory and run: python gen_conversation.py

To evaluate the generated results for Biography problems: python eval_conversation.py

MMLU:

To generate answers for MMLU through multiagent debate, cd into the mmlu directory and run: python gen_mmlu.py

To evaluate the generated results of MMLU: python eval_mmlu.py
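For multiple-choice tasks such as MMLU, scoring typically means extracting the chosen letter from each response and computing accuracy against the answer key. The sketch below shows one way this could look; it is not the code in eval_mmlu.py. It assumes responses state their choice as a parenthesised letter such as `(B)` with four options A to D, and the helper names `extract_choice` and `accuracy` are hypothetical.

```python
import re
from typing import List, Optional

# Assumption: answers are given as a parenthesised letter such as "(B)".
CHOICE = re.compile(r"\(([A-D])\)")


def extract_choice(response: str) -> Optional[str]:
    """Return the last (A)-(D) choice mentioned in a response, if any."""
    matches = CHOICE.findall(response)
    return matches[-1] if matches else None


def accuracy(responses: List[str], gold: List[str]) -> float:
    """Fraction of responses whose parsed choice equals the gold letter."""
    if len(responses) != len(gold):
        raise ValueError("responses and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(extract_choice(r) == g for r, g in zip(responses, gold))
    return hits / len(gold)
```

Taking the last match rather than the first lets a response discuss several options before committing to a final choice.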

You can download the MMLU dataset here

If you would like to cite the paper, here is a BibTeX entry:

@article{du2023improving,
  title={Improving Factuality and Reasoning in Language Models through Multiagent Debate},
  author={Du, Yilun and Li, Shuang and Torralba, Antonio and Tenenbaum, Joshua B and Mordatch, Igor},
  journal={arXiv preprint arXiv:2305.14325},
  year={2023}
}
