This in-development Python package allows you to use large language models (LLMs) as bots in oTree experiments. For interfacing with LLMs, it offers two options:
- litellm: Allows the use of various commercial LLMs
- llama.cpp: Allows the use of local (open source) LLMs
While both approaches have been tested and found to work, we have so far only used OpenAI's GPT-4o model for our own research. See further below for a list of commercial and open-source LLMs that we have verified to pass the package tests.
botex has been inspired by recent work but uses a different approach. Instead of relying on dedicated prompts, its bots consecutively scrape their respective oTree participant pages and infer the experimental flow solely from the web page text content. This avoids the risk of misalignment between the human (web page) and bot (LLM prompt) variants of an experimental design and, besides facilitating the study of LLM "behavior", makes it possible to use LLM participants to develop and pre-test oTree experiments that are designed (primarily) for human participants.
The downside of this approach is that the scraping has to rely on some level of standardization. Luckily, the oTree framework is relatively rigid, unless the user adds customized HTML forms to their experimental designs. Currently, all standard form models used by oTree are tested and verified to work. In the future, we plan to also support customized HTML forms, but this will likely require some standardization by the user implementing the experimental design.
If you want to use botex to create LLM participants for your own oTree experiments, you need the following:
- A working Python environment (version 3.10 or higher), preferably in a virtual environment.
- Google Chrome for scraping the oTree participant pages
- Access to an oTree server that you can start sessions on, or at least the URL of an oTree participant link. The server can be local or remote.
- Access to an LLM model for inference. See next section.
Then install the stable PyPI version of the botex package:

```bash
pip install botex
```

However, if you are courageous and want to use the current development version that is described in this README, you need to install it directly from this repository:

```bash
pip install git+https://github.com/joachim-gassen/botex.git
```
The botex package comes with a command line interface that allows you to start botex on a running oTree instance. To set one up, you can do the following in your virtual environment:

```bash
pip install otree
otree startproject otree # Say yes for examples
cd otree
otree devserver
```
Then start the botex command line interface by running `botex` in your virtual environment. You should see the following output:

```text
(.venv) user@host:~/github/project$ botex
Botex database file not provided. Defaulting to 'botex.db'
oTree server URL not provided. Defaulting to 'http://localhost:8000'
No LLM provided. Enter your litellm model string here or press enter
to accept the default ('gemini/gemini-1.5-flash'):
```
After accepting the default Gemini model, you need to enter an API key. If you do not have one yet, you can get a free one from the Google AI Studio. After entering the key, you can select an oTree experiment from the examples offered by the fresh oTree installation:

```text
Enter the API key for your LLM model (for the Gemini model,
you can get a free API key at https://aistudio.google.com/apikey):***
Available session configurations:
1: guess_two_thirds
2: survey
Select a configuration by number:
```
We suggest that you choose the Guess Two Thirds game.

```text
Selected session configuration: guess_two_thirds
Number of participants: 3
Number of human participants: 0
Session 'ouwx1xnb' initialized
You can monitor its progress at http://localhost:8000/SessionMonitor/ouwx1xnb
Starting bots on session...
```
You can monitor how the three bots complete the experiment. After they are done, you can store the botex data from the run.

```text
Session complete. You can view the oTree data at http://localhost:8000/SessionData/ouwx1xnb
Download the oTree data at: http://localhost:8000/export
Enter CSV file name to export botex data to or press Enter to skip:
```
The resulting CSV file will contain data similar to the following table:
session_id | participant_id | round | question_id | answer | reason |
---|---|---|---|---|---|
ouwx1xnb | j14kfulq | 1 | id_guess | 67 | I will start by choosing a number above 0. In this game, the goal is to get as close as possible to two-thirds of the average of all numbers submitted by players. Since it is the first round and I do not know what other people will do, I choose a number which is roughly in the middle. |
ouwx1xnb | j14kfulq | 2 | id_guess | 0 | In the first round, two-thirds of the average was 36.89, and the closest guess was 33. In the second round, if everyone plays the same strategy again, the average of 33, 33, and 33 would be 33, and two-thirds of the average would still be 33. To win, I should pick 0 to try and get close to that two-thirds value. If the other players also realize this, then the average will be lower and the two-thirds average will be even lower, and 0 will be a safe choice. |
ouwx1xnb | j14kfulq | 3 | id_guess | 0 | Following the iterative reasoning from the previous rounds, if all players continue to use the same logic, the average should approach zero. Therefore, choosing 0 is the most strategically sound choice. |
ouwx1xnb | 8o1f9cf9 | 1 | id_guess | 66 | I will start with a number close to 2/3 of 100 |
ouwx1xnb | 8o1f9cf9 | 2 | id_guess | 33 | In the first round, the average was 52, two thirds of the average is 34.67, and the closest guess was 33. I will guess 33 again, hoping others also guess 33. |
ouwx1xnb | 8o1f9cf9 | 3 | id_guess | 0 | Given the pattern of previous rounds, it's likely that players will try to anticipate each other's moves, potentially leading to a convergence toward lower numbers. I will choose 0, hoping to be closest to two-thirds of the average. |
ouwx1xnb | 1xctv42x | 1 | id_guess | 33 | I will start by guessing 33. This is a common starting point in this type of game, based on iterative reasoning and assuming other players will also use a similar strategy. |
ouwx1xnb | 1xctv42x | 2 | id_guess | 30 | In the first round, two-thirds of the average was close to 37. I believe that many players will expect this, and thus will choose a number around this value. I will go slightly lower, to account for the fact that many others will choose around 37. |
ouwx1xnb | 1xctv42x | 3 | id_guess | 10 | Given the previous rounds' two-thirds averages of 36.89 and 14.0, there is a clear downward trend. I anticipate other players will also recognize this pattern and adjust their guesses accordingly. To avoid being too close to the average, I will choose a relatively low number hoping to beat any average that will be close to 0. |
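If you want to analyze the exported data programmatically, reading the CSV with pandas is straightforward. A minimal sketch, assuming you exported to a (hypothetical) file `botex_data.csv` with the columns shown above:

```python
import pandas as pd

# Read the exported botex responses and compute the mean guess per round
df = pd.read_csv("botex_data.csv")
df["answer"] = pd.to_numeric(df["answer"], errors="coerce")
print(df.groupby("round")["answer"].mean())
```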
If you are interested in the additional options that the command line interface offers, we suggest you take a peek by running `botex -h`.
After installing botex, you should be able to start botex on an existing oTree participant link by running the following code snippet:

```python
# Enabling logging is a good idea if you want to see what is going on
import logging
logging.basicConfig(level=logging.INFO)

import botex

# Running a botex bot on a specific oTree participant link
botex.run_single_bot(
    botex_db = "path to a sqlite3 file that will store the bot data (does not have to exist)",
    session_name = "session config name of your oTree experiment (defaults to 'unknown')",
    session_id = "session ID of your oTree experiment (defaults to 'unknown')",
    url = "the URL of the participant link",
    model = "The LLM model that you want to use (defaults to 'gpt-4o-2024-08-06')",
    api_key = "The API key for your model (e.g., OpenAI)"
)
```
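To make this concrete, a hypothetical call might look like the following. All values below are made up for illustration; an oTree participant link typically has the form `http://<server>/InitializeParticipant/<participant_code>`:

```python
import botex

# Hypothetical values; replace them with your own session's details
botex.run_single_bot(
    botex_db = "botex.sqlite3",
    session_name = "guess_two_thirds",
    session_id = "ouwx1xnb",
    url = "http://localhost:8000/InitializeParticipant/j14kfulq",
    model = "gpt-4o-2024-08-06",
    api_key = "sk-..."  # your OpenAI API key
)
```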
Alternatively, you can use botex to initialize a session on your oTree server and start all required bots for the session in one go. This session can also contain human participants. However, in that case, you would be responsible for getting the humans to complete the session ;-)
```python
import logging
logging.basicConfig(level=logging.INFO)

import botex

# Initialize an oTree session
sdict = botex.init_otree_session(
    config_name = "config name of your oTree experiment",
    npart = 6, # number of participants in the session, including bots and humans
    nhumans = 0, # set to non-zero if you want humans to play along
    botex_db = "path to a sqlite3 file (does not have to exist)",
    otree_server_url = "url of your server, e.g., http://localhost:8000",
    otree_rest_key = "your oTree API secret key, if set"
)

# The returned dict will contain the oTree session ID, all participant codes,
# human indicators, and the URLs separately for the LLM and human participants.
# You can now start all LLM participants of the session in one go.
botex.run_bots_on_session(
    session_id = sdict['session_id'],
    botex_db = "same path that you used for initializing the session",
    model = "The LLM model that you want to use (defaults to 'gpt-4o-2024-08-06')",
    api_key = "The API key for your model (e.g., OpenAI)"
)
```
After the bots have completed their runs, their response data should be stored in your oTree database, just as is the case for human participants. If you are interested in exploring the botex data itself, which is stored in the sqlite3 file that you provided, we recommend that you take a look at our botex walk-through.
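If you just want a quick first look at the raw botex data, you can also inspect the sqlite3 file directly. A minimal sketch, assuming the database file is named `botex.sqlite3` (the table names are read from the file itself, so none are hard-coded):

```python
import sqlite3

# List all tables in the botex database together with their row counts
conn = sqlite3.connect("botex.sqlite3")
cur = conn.cursor()
tables = [
    row[0] for row in
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
for t in tables:
    n = cur.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
    print(f"Table {t}: {n} rows")
conn.close()
```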
The model that you use for inference has to support structured outputs. We have tested botex with the following LLMs:
Vendor | Model | Link | Status | Notes |
---|---|---|---|---|
OpenAI | gpt-4o-2024-08-06 and later | OpenAI API | OK | Requires at least paid user tier 1 |
OpenAI | gpt-4o-mini-2024-07-18 and later | OpenAI API | OK | Requires at least paid user tier 1 |
Google | gemini/gemini-1.5-flash-8b | Google AI Studio | OK | 1,500 requests per day are free |
Google | gemini/gemini-1.5-flash | Google AI Studio | OK | 1,500 requests per day are free |
Google | gemini/gemini-1.5-pro | Google AI Studio | OK | 50 requests per day are free (not usable for larger experiments in the free tier) |
Open Source | Mistral-7B-Instruct-v0.3.Q4_K_M.gguf | Hugging Face | OK | Run as a local model (see below) |
Open Source | qwen2.5-7b-instruct-q4_k_m.gguf | Hugging Face | OK | Run as a local model (see below) |
If you have success running botex with other models, please let us know so that we can add them to the list.
The default model that botex uses is `gpt-4o-2024-08-06`. If you want to use a different model, you can specify it in the `run_single_bot()` or `run_bots_on_session()` call by providing the model string from the table above as the `model` parameter. In any case, you have to provide the API key for the model that you want to use.
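For instance, a sketch of switching the session bots to one of the free-tier Gemini models from the table above (the API key placeholder is, of course, yours to fill in):

```python
botex.run_bots_on_session(
    session_id = sdict['session_id'],
    botex_db = "same path that you used for initializing the session",
    model = "gemini/gemini-1.5-flash",
    api_key = "your Google AI Studio API key"
)
```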
If you want to use a local LLM instead of commercial APIs via the litellm interface, you need, in addition to the above, to:

- Set up llama.cpp.
- Download a local LLM model in the GGUF format (one way to do this is shown below):
  - For optimal performance and compatibility, we recommend the current leading model Qwen2.5-7B-Instruct from Hugging Face, which is available in the GGUF format.
  - Refer to the model's documentation for specific instructions on which quantized variant to download and how much memory it requires.
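One way to obtain a suitable GGUF file is via the Hugging Face CLI. The repository and file names below are illustrative (they match the Qwen2.5 model from the table above, but check the model card for the current file names):

```bash
pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen2.5-7B-Instruct-GGUF \
    qwen2.5-7b-instruct-q4_k_m.gguf --local-dir ./models
```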
Then all you need to do is adjust the botex calls from above by specifying the model and its configuration. You do this by providing a dict that will become a `LocalLLM` object to the botex calls that start bots. For example, for `botex.run_bots_on_session()`, your call would look something like this:

```python
botex.run_bots_on_session(
    session_id = sdict['session_id'],
    botex_db = "same path that you used for initializing the session",
    model = "local",
    local_model_cfg = {
        "path_to_llama_server": "the path to the llama.cpp llama-server",
        "local_llm_path": "the path to your LLM model GGUF file"
    }
)
```
Everything else from above remains the same. When starting local LLMs as bots, take a good look at the logs to see how they are doing.
If you want to take a deep dive into botex and contribute to its development, you can do the following:

- Clone this repository: `git clone https://github.com/joachim-gassen/botex`
- Copy `_secret.env` to `secret.env` and edit it. Most importantly, you have to set your OpenAI key. As the oTree instance will only be used for testing, you can set any password and rest key that you like.
- Set up a virtual environment: `python3 -m venv .venv`
- Activate it: `source .venv/bin/activate`
- Install the necessary packages: `pip install -r requirements.txt`
- Install the botex package locally and editable: `pip install -e .`
- Test whether everything works: `pytest --remote`
- If you are using local LLMs, you also need to provide the required configuration information in `secret.env`. Then you should be able to test the local LLM config by running `pytest --local`
- If you are interested in testing both the remote and the local LLM setup, you can run both test batteries with `pytest`.
If it works, you should see test output similar to this one:

```text
=========================== test session starts ================================
platform darwin -- Python 3.12.2, pytest-8.1.1, pluggy-1.4.0
rootdir: /Users/joachim/github/botex
configfile: pyproject.toml
plugins: anyio-4.3.0, cov-4.1.0, dependency-0.6.0
collected 20 items
tests/test_a_botex_db.py .
tests/test_b_otree.py ....
tests/test_c_local_llm.py ......
tests/test_c_openai.py ......
tests/test_d_exports.py ..
------------------------------ Local LLM answers -------------------------------
Question: What is your favorite color?'
Answer: 'blue'
Rationale: 'As a human, I perceive colors visually and my favorite color is blue.'.
Question: What is your favorite number?'
Answer: '7'
Rationale: 'I have no personal feelings towards numbers, but I will randomly select the number 7.'.
Question: Do you like ice cream?'
Answer: 'Yes'
Rationale: 'Yes, I do like ice cream.'.
Question: Which statement do you most agree with?'
Answer: 'Humans are better than bots'
Rationale: 'I believe that humans have unique qualities and capabilities that set them apart from bots, but I acknowledge that bots can be useful in many ways.'.
Question: What do you enjoy doing most?'
Answer: 'Reading'
Rationale: 'I chose the activity I enjoy most'.
Question: How many people live on the earth currently (in billions)?'
Answer: '7.9'
Rationale: 'I looked up the current population of Earth'.
Question: Do you have any feedback that you want to share?'
Answer: 'This survey was interesting and I hope it helps improve the functionality of the python package for Language Model Models to participate in oTree experiments.'
Rationale: 'I have feedback to share'.
------------------------------ OpenAI answers ----------------------------------
Question: What is your favorite color?'
Answer: 'Blue'
Rationale: 'Blue is generally calming and pleasant to me.'.
Question: What is your favorite number?'
Answer: '7'
Rationale: '7 is considered a lucky number in many cultures and it's always been my favorite.'.
Question: Do you like ice cream?'
Answer: 'Yes'
Rationale: 'I enjoy the taste and variety of flavors.'.
Question: Which statement do you most agree with?'
Answer: 'Humans are better than bots'
Rationale: 'As an AI, I recognize the value that both humans and bots bring, but I understand the statement that 'Humans are better than bots' as humans create and provide meaningful interpretations for information.'.
Question: What do you enjoy doing most?'
Answer: 'Reading'
Rationale: 'I enjoy reading because it allows me to learn new things and relax.'.
Question: How many people live on the earth currently (in billions)?'
Answer: '7.8'
Rationale: 'Based on current estimates, the global population is about 7.8 billion.'.
Question: Do you have any feedback that you want to share?'
Answer: 'This survey is well-structured and straightforward to follow.'
Rationale: 'Providing constructive feedback can improve future surveys.'.
==================== 20 passed in 192.76s (0:03:12) ============================
```
You can see that the output also contains some questions and answers. They are accessible in `tests/questions_and_answers.csv` after the run and were given by two bot instances completing the oTree test survey in `tests/otree` during testing. The survey is designed to test the usage of standard oTree forms, buttons, and wait pages in a session with interacting participants.
The cost of running the tests on OpenAI using the "gpt-4o" model is roughly 0.10 US-$.
If you want to learn more about botex:
- take a look at our botex examples repo, providing a code walk-through for an actual online experiment (in which you can also participate), or
- read our current and somewhat preliminary paper.
If you are interested in this project or have already tried it, we would love to hear from you. Simply shoot us an email, comment on our LinkedIn post, or open an issue here on GitHub!