A framework for human-informed reinforcement learning by subjective logic

Repository structure

/01-experiment-setup - Input files for the experiment.
/02-maps - Map files: .xlsx
/03-input - Input files: maps and human advice
/04-src - Source code
- Main
  - runner.py - Main module
  - model.py - Model classes
- Advice/SL modules
  - advice_parser.py - Parses human input from /03-input. Input file naming convention: advice-[SIZE]x[SIZE]-seed[SEED].txt. File format:
```
 grid size [1]  
 advice [*] 
```
  - sl.py - Subjective logic utilities
- Map module
  - map_tools.py - Generator, renderer, and parser for maps. Saves maps under /02-maps as .xlsx files.
/05-experiments-output - Experiment data as .csv files
/06-analysis-output - Analysis of experiment data from /05-experiments-output as .pdf files
/tests - Unit tests.

Setup guide

Clone this repository.
Install requirements via pip install -r requirements.txt.

How to use

⚠️ All scripts must be run from the repository root directory. ⚠️

Running experiment

✏️ To replicate the experiment results as seen in the paper, follow the steps below with [SIZE] = 12 and [SEED] = 63. ✏️

Generate a map by running python .\04-src\map_tools.py (--generate --render --size [SIZE] --seed [SEED]) | -default. Replace [SIZE] and [SEED] with the integer values you need. The --render flag is optional. When run with the -default option, the default 4x4 map is generated. The generated map files are saved in the 02-maps folder.
Create all twelve advice files in the 03-input folder, named advice-[SIZE]x[SIZE]-seed[SEED]-[QUOTA].txt (e.g., advice-6x6-seed10-all.txt), one for each [QUOTA] value: {'all', 'holes', 'human10', 'human5', 'coop5-A1-topleft', 'coop5-A1-topright', 'coop5-A2-bottomleft', 'coop5-A2-bottomright', 'coop10-A1-topleft', 'coop10-A1-topright', 'coop10-A2-bottomleft', 'coop10-A2-bottomright'}
- Synthetic advice files can be generated by running python .\04-src\advice_tools.py --size [SIZE] --seed [SEED] -g [ALL|HOLES]. ALL generates advice for all cells; HOLES generates advice for the holes and the goal only. The other files must be created manually.
  - Advice values for frozen tiles in ALL: +1 if no neighboring holes; 0 if one neighboring hole; -1 otherwise.
  - ⚠️ Generated files are saved in the 02-maps folder and must be moved to the 03-input folder before the next step. ⚠️
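
The advice rule for frozen tiles in ALL can be sketched as follows. This is a minimal illustration, not the code in advice_tools.py: the grid encoding ('S', 'F', 'H', 'G'), the function names, and the choice of a 4-neighborhood (up, down, left, right) are all assumptions made for the example.

```python
# Hypothetical grid encoding (assumption): 'S' start, 'F' frozen, 'H' hole, 'G' goal.
def neighboring_holes(grid, row, col):
    """Count holes among the up/down/left/right neighbors of a cell.

    Using a 4-neighborhood is an assumption; the README does not say
    whether diagonal neighbors count.
    """
    count = 0
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[r])
        if inside and grid[r][c] == "H":
            count += 1
    return count


def frozen_tile_advice(grid, row, col):
    """Return +1 for no neighboring holes, 0 for exactly one, -1 otherwise."""
    holes = neighboring_holes(grid, row, col)
    if holes == 0:
        return 1
    if holes == 1:
        return 0
    return -1


# Example on an illustrative 4x4 layout.
example_grid = ["SFFF", "FHFH", "FFFH", "HFFG"]
print(frozen_tile_advice(example_grid, 0, 2))  # no neighboring holes
print(frozen_tile_advice(example_grid, 0, 1))  # one neighboring hole
print(frozen_tile_advice(example_grid, 1, 2))  # two neighboring holes
```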

✏️ The files generated by the steps above for the experiments as seen in the paper are located in the folder 01-experiment-setup. Copy the files from the 01-experiment-setup folder to the 03-input folder to skip the previous steps. ✏️
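
Before running the experiment, it can help to check that all twelve advice files are present in 03-input. The sketch below builds the expected file names from the naming convention above and reports any that are missing. The helper names (expected_advice_files, missing_advice_files) are hypothetical and not part of the repository.

```python
from pathlib import Path

# The twelve quota values listed in the naming convention.
QUOTAS = [
    "all", "holes", "human10", "human5",
    "coop5-A1-topleft", "coop5-A1-topright",
    "coop5-A2-bottomleft", "coop5-A2-bottomright",
    "coop10-A1-topleft", "coop10-A1-topright",
    "coop10-A2-bottomleft", "coop10-A2-bottomright",
]


def expected_advice_files(size, seed):
    """Build the expected advice file names for a given map size and seed."""
    return [f"advice-{size}x{size}-seed{seed}-{quota}.txt" for quota in QUOTAS]


def missing_advice_files(input_dir, size, seed):
    """Return the expected advice file names that do not exist in input_dir."""
    input_dir = Path(input_dir)
    return [
        name
        for name in expected_advice_files(size, seed)
        if not (input_dir / name).is_file()
    ]
```

For the paper's setup, missing_advice_files("03-input", 12, 63) should return an empty list once the files have been copied.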

Run the experiment using python .\04-src\runner.py.

Mandatory parameter:
- --mode [MODE] -- The [MODE] value is one of the following: random, noadvice, synthetic, coop.
Optional parameters:
- --log [LOG_LEVEL] -- The [LOG_LEVEL] value is one of the following: critical, error, warn, warning, info, debug.
- --name [STRING] -- The name of the experiment, used to name the top-level results folder. If not provided, the folder is named after datetime.now(), formatted as "%Y%m%d-%H%M%S".
Settings (size, seed, numexperiments, maxepisodes) can be set in runner.__name__.
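
As an illustration of how these parameters fit together, here is a minimal argparse sketch. It is not the actual code in runner.py: the function names, the default log level, and the help texts are assumptions. Only the flag names, the allowed values, and the timestamp format come from the description above.

```python
import argparse
from datetime import datetime

MODES = ["random", "noadvice", "synthetic", "coop"]
LOG_LEVELS = ["critical", "error", "warn", "warning", "info", "debug"]


def build_parser():
    """Build a parser mirroring the documented runner parameters."""
    parser = argparse.ArgumentParser(description="Run an experiment (sketch).")
    parser.add_argument("--mode", required=True, choices=MODES,
                        help="Experiment mode (mandatory).")
    # The default log level is an assumption; the README does not state one.
    parser.add_argument("--log", choices=LOG_LEVELS, default="info",
                        help="Logging level (optional).")
    parser.add_argument("--name", default=None,
                        help="Experiment name used for the results folder (optional).")
    return parser


def results_folder_name(name=None, now=None):
    """Use the given name, or fall back to a timestamp in %Y%m%d-%H%M%S format."""
    if name:
        return name
    now = now or datetime.now()
    return now.strftime("%Y%m%d-%H%M%S")


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.mode, args.log, results_folder_name(args.name))
```
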
Results will be generated into /05-experiments-output, under a timestamped folder, with the following folder structure:

 - [maxepisodes]  
	 - policy_data 
		- advice-coop5-topleft-bottomright 
			- One .csv file named after the map size and seed. 
		- advice-coop5-topright-bottomleft 
			- ... 
		- advice-coop10-topleft6-bottomright 
			- ... 
		- advice-coop10-topright-bottomleft 
			- ... 
		- advice-synthetic-all 
			- Multiple .csv files named after the map size, seed, and the _u_ parameter used in the specific experiment. 
		- advice-synthetic-holes 
			- ... 
		- advice-synthetic-human5 
			- ... 
		- advice-synthetic-human10 
			- ... 
		- noadvice 
			- One .csv file named after the map size and seed. 
		- random 
			- ... 
	- reward_data 
		- ...
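
To inspect a run programmatically, the folder structure above can be walked with pathlib. The sketch below groups the .csv files by their condition folder (e.g. noadvice). The function name is hypothetical, and it assumes that reward_data uses the same layout as policy_data, which the listing above does not spell out.

```python
from collections import defaultdict
from pathlib import Path


def collect_csv_files(run_dir, kind="policy_data"):
    """Map each condition folder name to the sorted list of its .csv files.

    run_dir is one results folder under 05-experiments-output. The glob
    pattern follows [maxepisodes]/[kind]/[condition]/*.csv.
    """
    files = defaultdict(list)
    for csv_path in sorted(Path(run_dir).glob(f"*/{kind}/*/*.csv")):
        files[csv_path.parent.name].append(csv_path)
    return dict(files)
```

Each returned path can then be loaded with any CSV reader; the column layout of the files is not described here, so it is left out of the sketch.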

Analysis and plotting

Run python .\04-src\analysis.py -a [METHOD_NAME] -s [True|False] --log [LOG_LEVEL].
Optional parameters:
- -a [METHOD_NAME] -- The [METHOD_NAME] value is one of the following: cumulative_reward, heatmap.
- -s [True|False] -- Whether to stash the results folder.
- --log [LOG_LEVEL] -- The [LOG_LEVEL] value is one of the following: critical, error, warn, warning, info, debug.
Results will be generated into /06-analysis-output as .pdf files.
