Code accompanying the paper Avoiding Side Effects in Complex Environments

This repository contains the main results and code for reproducing the experiments performed in the paper.

We also include pretrained models for each tested method on each SafeLife task.

The paper can be found on arXiv.

Abstract

Reward function specification can be difficult. Rewarding the agent for making a widget may be easy, but penalizing the multitude of possible negative side effects is hard. In toy environments, Attainable Utility Preservation (AUP) avoided side effects by penalizing shifts in the ability to achieve randomly generated goals. We scale this approach to large, randomly generated environments based on Conway's Game of Life. By preserving optimal value for a single randomly generated reward function, AUP incurs modest overhead while leading the agent to complete the specified task and avoid many side effects.
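For orientation, the penalty described in the abstract roughly takes the following shape. This is a sketch of the AUP penalty from the broader AUP line of work, not necessarily the exact normalization used in the paper; here $R$ is the specified task reward, $R_\text{aux}$ is the single randomly generated auxiliary reward function, $Q_{R_\text{aux}}$ its optimal action-value function, and $\varnothing$ the no-op action:

$$
R_\text{AUP}(s, a) \;=\; R(s, a) \;-\; \lambda \, \frac{\left| Q_{R_\text{aux}}(s, a) - Q_{R_\text{aux}}(s, \varnothing) \right|}{Q_{R_\text{aux}}(s, \varnothing)}
$$

Intuitively, actions that change the agent's ability to optimize the auxiliary reward, relative to doing nothing, are penalized in proportion to that change.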

SafeLife is a novel environment for testing the safety of reinforcement learning agents. The long-term goal of the SafeLife project is to develop training environments and benchmarks for a range of technical reinforcement learning safety problems.

Usage

Install the SafeLife environment by following the instructions in its repository.

Alternatively, here are some basic instructions for a local install:

pip3 install -r requirements.txt
python3 setup.py build_ext --inplace

Note that we use version 1.0 of SafeLife. The current master branch of SafeLife contains large changes that we have not thoroughly tested.
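If you install SafeLife from its own repository rather than from this one, you can check out the 1.0 release before building. A minimal sketch, assuming the release is tagged v1.0:

git clone https://github.com/PartnershipOnAI/safelife.git
cd safelife
git checkout v1.0
pip3 install -r requirements.txt
python3 setup.py build_ext --inplace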

Training an Agent

The train script is an easy way to get agents up and running with the default proximal policy optimization (PPO) implementation. Just run

./train --algo aup

to start training. Saved files, including checkpoints, log files, and intermediate episode videos, are stored in data/aup/<task>.
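The PPO baseline included in this repository can presumably be trained the same way by swapping the algorithm name (treat the exact flag value and output path as assumptions):

./train --algo ppo

Outputs would then be written under data/ppo/<task>.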

Loading a Saved Model

We include saved models for AUP and the PPO baseline on each SafeLife task.
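A minimal sketch of inspecting one of these checkpoints, assuming they are ordinary PyTorch files; the path and any key names below are hypothetical:

```python
import torch

# Hypothetical path; the actual layout under data/<algo>/<task>/ may differ.
ckpt_path = "data/aup/append_still-easy/checkpoint.pt"

# Assumes the saved model is a standard PyTorch checkpoint.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Print the top-level structure to see which state dicts it contains.
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))
else:
    print(type(checkpoint))
```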

Continuing Training with a Model Checkpoint
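No recipe is given here, but as a generic illustration (not the repository's actual training code), resuming from a PyTorch checkpoint usually looks like the following; every name below is a hypothetical stand-in:

```python
import torch

# Hypothetical placeholders for the policy network and optimizer actually used
# by the training script.
model = torch.nn.Linear(16, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

checkpoint = torch.load("data/aup/append_still-easy/checkpoint.pt", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])          # hypothetical key
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])  # hypothetical key
start_step = checkpoint.get("step", 0)                          # hypothetical key

# ...continue the training loop from start_step...
```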

Generating Agent Videos with a Model Checkpoint
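Again as a generic sketch rather than the repository's own tooling: given a list of RGB frames from an episode rollout (how frames are obtained from SafeLife, e.g. via env.render(), is an assumption), imageio can write them to a GIF:

```python
import imageio
import numpy as np

# Placeholder frames; in practice these would be RGB arrays collected while
# running a trained policy in the environment.
frames = [np.zeros((350, 350, 3), dtype=np.uint8) for _ in range(30)]

imageio.mimsave("episode.gif", frames)
```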

Results on SafeLife Tasks

We trained agents on four SafeLife tasks. Two of the tasks involve placing cells on goal tiles on an initially static board, which starts with either many green cells (append_still) or fewer (append_still-easy). The third task has the same goal, but the board contains dynamic yellow cells that spawn additional cells (append_spawn). In the final task, the agent must remove red cell patterns from an initially static board (prune_still-easy). Below, we show the main results (reward and side effects) for all considered methods on each task.

GIF files for each task can be found in the GIFs directory.

Append_Still-Easy Results

(Main results figures for append_still-easy.)

Append_Still Results

(Main results figures for append_still.)

Append_Spawn Results

(Main results figures for append_spawn.)

Prune_Still-Easy Results

(Main results figures for prune_still-easy.)