From 2697142d5a483da7fce651624005eb29286cc101 Mon Sep 17 00:00:00 2001
From: Gal Novik
Date: Wed, 24 Jul 2019 16:10:58 +0300
Subject: [PATCH] Release 1.0.0 (#382)

* Updating README

* Shortening test cycles
---
 .circleci/config.yml      | 19 +++++++------
 CONTRIBUTING.md           |  2 +-
 README.md                 | 59 +++++++++++++++++++++------------------
 rl_coach/tests/pytest.ini |  2 ++
 setup.py                  |  2 +-
 5 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 641251dd7..9b334a00a 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -731,18 +731,19 @@ workflows:
       - functional_tests:
           requires:
             - build_base
-      - functional_test_doom:
-          requires:
-            - build_doom_env
-            - functional_tests
-      - functional_test_mujoco:
-          requires:
-            - build_mujoco_env
-            - functional_test_doom
+#      - functional_test_doom:
+#          requires:
+#            - build_doom_env
+#            - functional_tests
+#      - functional_test_mujoco:
+#          requires:
+#            - build_mujoco_env
+#            - functional_test_doom
       - golden_test_gym:
           requires:
             - build_gym_env
-            - functional_test_mujoco
+#            - functional_test_mujoco
+            - functional_tests
       - golden_test_doom:
           requires:
             - build_doom_env
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 987962cd2..800635c76 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -54,7 +54,7 @@ Coach is released as two pypi packages:
 Each pypi package release has a GitHub release and tag with the same version number.
 The numbers are of the X.Y.Z format, where

-X - zero in the near future, may change when Coach is feature complete
+X - currently one, will be incremented on major API changes

 Y - major releases with new features

diff --git a/README.md b/README.md
index 2dfb2bed0..da266fa59 100644
--- a/README.md
+++ b/README.md
@@ -29,20 +29,23 @@ coach -p CartPole_DQN -r
 * [Release 0.9.0](https://ai.intel.com/reinforcement-learning-coach-carla-qr-dqn/)
 * [Release 0.10.0](https://ai.intel.com/introducing-reinforcement-learning-coach-0-10-0/)
 * [Release 0.11.0](https://ai.intel.com/rl-coach-data-science-at-scale)
-* Release 0.12.0 (current release)
+* [Release 0.12.0](https://github.com/NervanaSystems/coach/releases/tag/v0.12.0)
+* Release 1.0.0 (current release)

-Contacting the Coach development team is also possible through the email [coach@intel.com](coach@intel.com)
+Contacting the Coach development team is also possible over [email](mailto:coach@intel.com)

 ## Table of Contents

 - [Coach](#coach)
-  * [Overview](#overview)
   * [Benchmarks](#benchmarks)
-  * [Documentation](#documentation)
   * [Installation](#installation)
-  * [Usage](#usage)
-    + [Running Coach](#running-coach)
-    + [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
+  * [Getting Started](#getting-started)
+    * [Tutorials and Documentation](#tutorials-and-documentation)
+    * [Basic Usage](#basic-usage)
+      * [Running Coach](#running-coach)
+      * [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
+    * [Distributed Multi-Node Coach](#distributed-multi-node-coach)
+    * [Batch Reinforcement Learning](#batch-reinforcement-learning)
   * [Supported Environments](#supported-environments)
   * [Supported Algorithms](#supported-algorithms)
   * [Citation](#citation)
@@ -52,13 +55,6 @@ Contacting the Coach development team is also possible through the email [coach@
 One of the main challenges when building a research project, or a solution based on a published algorithm, is getting a
 concrete and reliable baseline that reproduces the algorithm's results, as reported by its authors. To address this problem, we are releasing a set of [benchmarks](benchmarks) that shows Coach reliably reproduces many state of the art algorithm results.

-## Documentation
-
-Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](https://nervanasystems.github.io/coach/).
-
-Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment can be found [here](https://github.com/NervanaSystems/coach/tree/master/tutorials).
-
-
 ## Installation

 Note: Coach has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
@@ -113,9 +109,16 @@ If a GPU is present, Coach's pip package will install tensorflow-gpu, by default

 In addition to OpenAI Gym, several other environments were tested and are supported. Please follow the instructions in the Supported Environments section below in order to install more environments.

-## Usage
+## Getting Started
+
+### Tutorials and Documentation
+[Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment](https://github.com/NervanaSystems/coach/tree/master/tutorials).
+
+[Framework documentation, algorithm description and instructions on how to contribute a new agent/environment](https://nervanasystems.github.io/coach/).
+
+### Basic Usage

-### Running Coach
+#### Running Coach

 To allow reproducing results in Coach, we defined a mechanism called _preset_.
 There are several available presets under the `presets` directory.
@@ -167,17 +170,7 @@ It is easy to create new presets for different levels or environments by followi

 More usage examples can be found [here](https://github.com/NervanaSystems/coach/blob/master/tutorials/0.%20Quick%20Start%20Guide.ipynb).

-### Distributed Multi-Node Coach
-
-As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
-For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
-
-### Batch Reinforcement Learning
-
-Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
-There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
-
-### Running Coach Dashboard (Visualization)
+#### Running Coach Dashboard (Visualization)

 Training an agent to solve an environment can be tricky, at times.
 In order to debug the training process, Coach outputs several signals, per trained algorithm, in order to track algorithmic performance.
@@ -195,6 +188,17 @@ dashboard

 Coach Design

+### Distributed Multi-Node Coach
+
+As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
+For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
+
+### Batch Reinforcement Learning
+
+Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
+There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
+
+
 ## Supported Environments

 * *OpenAI Gym:*
@@ -285,6 +289,7 @@ dashboard
 * [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86))
 * [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224) | **Multi Worker Single Node** ([code](rl_coach/agents/acer_agent.py))
 * [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) ([code](rl_coach/agents/soft_actor_critic_agent.py))
+* [Twin Delayed Deep Deterministic Policy Gradient](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))

 ### General Agents
 * [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Multi Worker Single Node** ([code](rl_coach/agents/dfp_agent.py))
diff --git a/rl_coach/tests/pytest.ini b/rl_coach/tests/pytest.ini
index 1294264bc..5903d6091 100644
--- a/rl_coach/tests/pytest.ini
+++ b/rl_coach/tests/pytest.ini
@@ -5,3 +5,5 @@ markers =
     integration_test: long test that checks that the complete framework is running correctly
 filterwarnings =
     ignore::DeprecationWarning
+norecursedirs =
+    *mxnet*
diff --git a/setup.py b/setup.py
index 0b0cb1349..ee59eb75a 100644
--- a/setup.py
+++ b/setup.py
@@ -85,7 +85,7 @@ setup(
     name='rl-coach' if not slim_package else 'rl-coach-slim',
-    version='0.12.1',
+    version='1.0.0',
     description='Reinforcement Learning Coach enables easy experimentation with state of the art Reinforcement Learning algorithms.',
     url='https://github.com/NervanaSystems/coach',
     author='Intel AI Lab',