Commit

Merge branch 'main' into fix/tsp-rendering
sash-a authored Nov 1, 2024
2 parents 9186861 + 85333d7 commit 659b07c
Showing 34 changed files with 3,068 additions and 25 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -39,7 +39,7 @@
</div>
<div class="row" align="center">
<img src="docs/env_anim/multi_cvrp.gif" alt="MultiCVRP" width="16%">
<img src="docs/env_anim/pac_man.gif" alt="PacMan" width="16%">
<img src="docs/env_anim/pac_man.gif" alt="PacMan" width="12.9%">
<img src="docs/env_anim/robot_warehouse.gif" alt="RobotWarehouse" width="16%">
<img src="docs/env_anim/rubiks_cube.gif" alt="RubiksCube" width="16%">
<img src="docs/env_anim/sliding_tile_puzzle.gif" alt="SlidingTilePuzzle" width="16%">
@@ -50,6 +50,7 @@
<img src="docs/env_anim/sudoku.gif" alt="Sudoku" width="16%">
<img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/tsp.gif" alt="TSP" width="16%">
<img src="docs/env_anim/lbf.gif" alt="Level-Based Foraging" width="16%">
</div>
</div>

@@ -121,6 +122,7 @@ problems.
| Multi Minimum Spanning Tree Problem | Routing | `MMST-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst) | [doc](https://instadeepai.github.io/jumanji/environments/mmst/) |
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pac_man/) | [doc](https://instadeepai.github.io/jumanji/environments/pac_man/) |
| 👾 Sokoban | Routing | `Sokoban-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/sokoban/) | [doc](https://instadeepai.github.io/jumanji/environments/sokoban/) |
| 🍎 Level-Based Foraging | Routing | `LevelBasedForaging-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/lbf/) | [doc](https://instadeepai.github.io/jumanji/environments/lbf/) |

<h2 name="install" id="install">Installation 🎬</h2>

9 changes: 9 additions & 0 deletions docs/api/environments/lbf.md
@@ -0,0 +1,9 @@
::: jumanji.environments.routing.lbf.env.LevelBasedForaging
    selection:
        members:
            - __init__
            - reset
            - step
            - observation_spec
            - action_spec
            - render
Binary file added docs/env_anim/lbf.gif
43 changes: 43 additions & 0 deletions docs/environments/lbf.md
@@ -0,0 +1,43 @@
# Level-Based Foraging Environment

<p align="center">
<img src="../env_anim/lbf.gif" width="600"/>
</p>

We provide a JAX jit-able implementation of the [Level-Based Foraging](https://github.com/semitable/lb-foraging/tree/master)
environment.

Level-Based Foraging (LBF) is a mixed cooperative-competitive environment that emphasises coordination between agents. As illustrated above, agents are placed in a grid world and each is assigned a level.

To collect food, agents must be adjacent to it and the cumulative level of participating agents must meet or exceed the food's designated level. Agents receive points based on the level of the collected food and their own level.
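
The loading rule above can be sketched in plain Python. This is a simplified illustration, not the environment's jit-able JAX implementation: the helper names `is_adjacent` and `can_collect` are hypothetical, and treating "adjacent" as sharing an edge (4-neighbourhood) is an assumption made for the example.

```python
from typing import List, Tuple

Position = Tuple[int, int]  # (row, col) on the grid


def is_adjacent(a: Position, b: Position) -> bool:
    """True when two cells share an edge (Manhattan distance of 1). Assumed adjacency rule."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def can_collect(
    food_pos: Position,
    food_level: int,
    loaders: List[Tuple[Position, int]],
) -> bool:
    """Check whether the agents attempting to load can collect the food.

    `loaders` holds (position, level) pairs for the agents that chose the load action.
    Only agents adjacent to the food contribute their level.
    """
    combined_level = sum(level for pos, level in loaders if is_adjacent(pos, food_pos))
    return combined_level >= food_level


# Example: food of level 4 at (2, 1), with agents of level 2 and level 4 next to it.
food_pos, food_level = (2, 1), 4
alone = can_collect(food_pos, food_level, [((1, 1), 2)])
together = can_collect(food_pos, food_level, [((1, 1), 2), ((2, 2), 4)])
print(alone, together)
```

In this example the level-2 agent cannot collect the food alone, while the two agents together can, because their combined level is at least the food's level.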

## Observation

The **observation** seen by the agent is a `NamedTuple` containing the following:

- `agents_view`: jax array (int32) of shape `(num_agents, num_obs_features)`, array representing each agent's view of other agents
and food.

- `action_mask`: jax array (bool) of shape `(num_agents, 6)`, array specifying, for each agent,
which action (noop, up, down, left, right, load) is legal.

- `step_count`: jax array (int32) of shape `()`, number of steps elapsed in the current episode.
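
To make the shapes above concrete, here is a minimal stand-in written with plain Python lists rather than JAX arrays. The class name `ObservationSketch`, the helper `empty_observation`, and the value `num_obs_features=15` are illustrative assumptions, not the library's definitions.

```python
from typing import List, NamedTuple


class ObservationSketch(NamedTuple):
    """Illustrative stand-in for the observation; nested lists replace jax arrays."""

    agents_view: List[List[int]]   # shape (num_agents, num_obs_features)
    action_mask: List[List[bool]]  # shape (num_agents, 6)
    step_count: int                # scalar, steps elapsed in the current episode


def empty_observation(num_agents: int, num_obs_features: int) -> ObservationSketch:
    """Build a zero-filled observation in which only noop is marked legal."""
    agents_view = [[0] * num_obs_features for _ in range(num_agents)]
    # Action order: noop, up, down, left, right, load.
    action_mask = [[True] + [False] * 5 for _ in range(num_agents)]
    return ObservationSketch(agents_view, action_mask, step_count=0)


# num_obs_features=15 is an arbitrary value chosen for the example.
obs = empty_observation(num_agents=2, num_obs_features=15)
print(len(obs.agents_view), len(obs.agents_view[0]))  # rows = agents, columns = features
print(len(obs.action_mask), len(obs.action_mask[0]))  # rows = agents, columns = actions
```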

## Action

The action space is a `MultiDiscreteArray` containing an integer value in `[0, 1, 2, 3, 4, 5]` for each
agent. Each agent can take one of six actions: noop (`0`), up (`1`), down (`2`), left (`3`), right (`4`), or load, i.e. pick up food (`5`).
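
The mapping from action index to grid displacement can be sketched as follows. The `MOVES` table mirrors the one added in `constants.py` in this commit; the helper `next_position` and its rule of staying in place when a move would leave the grid are assumptions for illustration only, since the environment itself relies on the action mask to rule out illegal moves.

```python
from typing import Tuple

# Row/column displacement for each action index, in the order
# noop, up, down, left, right, load (noop and load do not move the agent).
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]


def next_position(position: Tuple[int, int], action: int, grid_size: int) -> Tuple[int, int]:
    """Return the cell an agent would occupy after `action`.

    Hypothetical helper: if the move would leave the grid, the agent stays put.
    """
    d_row, d_col = MOVES[action]
    row, col = position[0] + d_row, position[1] + d_col
    if 0 <= row < grid_size and 0 <= col < grid_size:
        return (row, col)
    return position


print(next_position((2, 2), 1, grid_size=6))  # up decreases the row index
print(next_position((0, 0), 3, grid_size=6))  # left from column 0 is blocked in this sketch
```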

The episode terminates under the following conditions:

- An invalid action is taken, or

- An agent collides with another agent.

## Reward

The reward is equal to the sum of the levels of collected food divided by the level of the agents that collected them.

## Registered Versions 📖

- `LevelBasedForaging-v0`, a grid with 2 agents, each with a field of view equal to the grid size (full observation case), and 2 food items, with cooperation between agents forced.
6 changes: 6 additions & 0 deletions jumanji/__init__.py
@@ -140,3 +140,9 @@
register(
    id="SlidingTilePuzzle-v0", entry_point="jumanji.environments:SlidingTilePuzzle"
)

# LevelBasedForaging with a random generator: grid size of 8,
# 2 agents, 2 food items, and a maximum agent level of 2.
register(
    id="LevelBasedForaging-v0", entry_point="jumanji.environments:LevelBasedForaging"
)
1 change: 1 addition & 0 deletions jumanji/environments/__init__.py
@@ -50,6 +50,7 @@
from jumanji.environments.routing.cleaner.env import Cleaner
from jumanji.environments.routing.connector.env import Connector
from jumanji.environments.routing.cvrp.env import CVRP
from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.maze.env import Maze
from jumanji.environments.routing.mmst.env import MMST
from jumanji.environments.routing.multi_cvrp.env import MultiCVRP
26 changes: 12 additions & 14 deletions jumanji/environments/packing/bin_pack/env.py
@@ -86,20 +86,18 @@ class BinPack(Environment[State, specs.MultiDiscreteArray, Observation]):
already packed.
- state: `State`
- coordinates: jax array (float) of shape (num_nodes + 1, 2)
the coordinates of each node and the depot.
- demands: jax array (int32) of shape (num_nodes + 1,)
the associated cost of each node and the depot (0.0 for the depot).
- position: jax array (int32)
the index of the last visited node.
- capacity: jax array (int32)
the current capacity of the vehicle.
- visited_mask: jax array (bool) of shape (num_nodes + 1,)
binary mask (False/True <--> not visited/visited).
- trajectory: jax array (int32) of shape (2 * num_nodes,)
identifiers of the nodes that have been visited (set to DEPOT_IDX if not filled yet).
- num_visits: int32
number of actions that have been taken (i.e., unique visits).
- container: space defined by 2 points, i.e. 6 coordinates.
- ems: empty maximal spaces (EMSs) in the container, each defined by 2 points
(6 coordinates).
- ems_mask: array of booleans that indicate the EMSs that are valid.
- items: defined by 3 attributes (x, y, z).
- items_mask: array of booleans that indicate the items that can be packed.
- items_placed: array of booleans that indicate the items that have been placed so far.
- items_location: locations of items in the container, defined by 3 coordinates (x, y, z).
- action_mask: array of booleans that indicate the valid actions,
i.e. EMSs and items that can be chosen.
- sorted_ems_indexes: EMS indexes that are sorted by decreasing volume order.
- key: random key used for auto-reset.
```python
from jumanji.environments import BinPack
2 changes: 1 addition & 1 deletion jumanji/environments/routing/connector/env.py
@@ -89,7 +89,7 @@ class Connector(Environment[State, specs.MultiDiscreteArray, Observation]):
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)
env.render(state)
action = env.action_specc.generate_value()
action = env.action_spec.generate_value()
state, timestep = jax.jit(env.step)(state, action)
env.render(state)
```
17 changes: 17 additions & 0 deletions jumanji/environments/routing/lbf/__init__.py
@@ -0,0 +1,17 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.lbf.observer import GridObserver, VectorObserver
from jumanji.environments.routing.lbf.types import Agent, Food, Observation, State
205 changes: 205 additions & 0 deletions jumanji/environments/routing/lbf/conftest.py
@@ -0,0 +1,205 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import chex
import jax
import jax.numpy as jnp
import pytest

from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.lbf.generator import RandomGenerator
from jumanji.environments.routing.lbf.types import Agent, Food, State
from jumanji.tree_utils import tree_transpose

# create food and agents for grid that looks like:
# "AGENT" | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | "AGENT" | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | "FOOD" | "AGENT" | "FOOD" | EMPTY | EMPTY
# EMPTY | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | EMPTY | "FOOD" | EMPTY | EMPTY | EMPTY
# EMPTY | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY


@pytest.fixture
def key() -> chex.PRNGKey:
    return jax.random.PRNGKey(42)


@pytest.fixture
def agent0() -> Agent:
    return Agent(
        id=jnp.asarray(0),
        position=jnp.array([0, 0]),
        level=jnp.asarray(1),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def agent1() -> Agent:
    return Agent(
        id=jnp.asarray(1),
        position=jnp.array([1, 1]),
        level=jnp.asarray(2),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def agent2() -> Agent:
    return Agent(
        id=jnp.asarray(2),
        position=jnp.array([2, 2]),
        level=jnp.asarray(4),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def food0() -> Food:
    return Food(
        id=jnp.asarray(0),
        position=jnp.array([2, 1]),
        level=jnp.asarray(4),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def food1() -> Food:
    return Food(
        id=jnp.asarray(1),
        position=jnp.array([2, 3]),
        level=jnp.asarray(4),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def food2() -> Food:
    return Food(
        id=jnp.asarray(2),
        position=jnp.array([4, 2]),
        level=jnp.asarray(3),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def agents(agent0: Agent, agent1: Agent, agent2: Agent) -> Agent:
    return tree_transpose([agent0, agent1, agent2])


@pytest.fixture
def food_items(food0: Food, food1: Food, food2: Food) -> Food:
    return tree_transpose([food0, food1, food2])


@pytest.fixture
def state(agents: Agent, food_items: Food, key: chex.PRNGKey) -> State:
    return State(agents=agents, food_items=food_items, step_count=0, key=key)


@pytest.fixture
def agent_grid() -> chex.Array:
    """Returns the agents' levels at their positions on the grid."""
    return jnp.array(
        [
            [1, 0, 0, 0, 0, 0],
            [0, 2, 0, 0, 0, 0],
            [0, 0, 4, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
        ]
    )


@pytest.fixture
def food_grid() -> chex.Array:
    """Returns the food items' levels at their positions on the grid."""
    return jnp.array(
        [
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 4, 0, 4, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 3, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
        ]
    )


@pytest.fixture
def random_generator() -> RandomGenerator:
    return RandomGenerator(
        grid_size=8,
        fov=2,
        num_agents=2,
        num_food=2,
        max_agent_level=2,
        force_coop=True,
    )


@pytest.fixture
def lbf_environment() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=6,
        num_agents=3,
        num_food=3,
        max_agent_level=4,
        force_coop=True,
    )

    return LevelBasedForaging(generator=generator, time_limit=5)


@pytest.fixture
def lbf_env_2s() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=2,
        num_agents=2,
        num_food=2,
        max_agent_level=2,
        force_coop=False,
    )

    return LevelBasedForaging(generator=generator, time_limit=5)


@pytest.fixture
def lbf_env_grid_obs() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=6,
        num_agents=3,
        num_food=3,
        max_agent_level=4,
        force_coop=True,
    )

    return LevelBasedForaging(generator=generator, grid_observation=True)


@pytest.fixture
def lbf_with_penalty() -> LevelBasedForaging:
    return LevelBasedForaging(penalty=1.0)


@pytest.fixture
def lbf_with_no_norm_reward() -> LevelBasedForaging:
    return LevelBasedForaging(normalize_reward=False)
32 changes: 32 additions & 0 deletions jumanji/environments/routing/lbf/constants.py
@@ -0,0 +1,32 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import jax.numpy as jnp

# Actions
NOOP = 0
UP = 1
DOWN = 2
LEFT = 3
RIGHT = 4
LOAD = 5

# NOOP, UP, DOWN, LEFT, RIGHT, LOAD
MOVES = jnp.array([[0, 0], [-1, 0], [1, 0], [0, -1], [0, 1], [0, 0]])

# viewer constants
_FIGURE_SIZE = (5, 5)

# Define some colors for visualization.
_GRID_COLOR = (0, 0, 0) # black
_LINE_COLOR = (1, 1, 1) # white