MaMuJoCo (Multi-Agent MuJoCo)#

MaMuJoCo was introduced in “FACMAC: Factored Multi-Agent Centralised Policy Gradients”.
There are two types of environments: (1) multi-agent factorizations of Gymnasium/MuJoCo tasks and (2) new, complex MuJoCo tasks meant to be solved with multi-agent algorithms.
Gymnasium-Robotics/MaMuJoCo represents the first easy-to-use framework for research on agent factorization.
API#
MaMuJoCo mainly uses the PettingZoo.ParallelAPI, but also supports a few extra functions:
- gymnasium_robotics.mamujoco_v0.parallel_env.map_local_actions_to_global_action(self, actions: dict[str, ndarray]) ndarray #
Maps multi agent actions into single agent action space.
- Parameters:
actions – A dict representing the action of each agent
- Returns:
The action of the whole domain (what the equivalent single-agent action would be)
- Raises:
AssertionError – If the agent action factorization is badly defined (an action is defined twice or not defined at all)
- gymnasium_robotics.mamujoco_v0.parallel_env.map_global_action_to_local_actions(self, action: ndarray) dict[str, ndarray] #
Maps single agent action into multi agent action spaces.
- Parameters:
action – An array representing the actions of the single agent for this domain
- Returns:
A dictionary of actions to be performed by each agent
- Raises:
AssertionError – If the Agent action factorization sizes are badly defined
- gymnasium_robotics.mamujoco_v0.parallel_env.map_global_state_to_local_observations(self, global_state: ndarray) dict[str, ndarray] #
Maps single agent observation into multi agent observation spaces.
- Parameters:
global_state – the global_state (generated from MaMuJoCo.state())
- Returns:
A dictionary of states that would be observed by each agent given the ‘global_state’
- gymnasium_robotics.mamujoco_v0.parallel_env.map_local_observation_to_global_state(self, local_observations: dict[str, ndarray]) ndarray #
Maps multi agent observations into single agent observation space.
NOT IMPLEMENTED, try using MaMuJoCo.state() instead
- Parameters:
local_observations – the local observations of each agent (generated from MaMuJoCo.step())
- Returns:
the global observations that correspond to a single agent (what you would get with MaMuJoCo.state())
- gymnasium_robotics.mamujoco_v0.get_parts_and_edges(label: str, partitioning: str | None) tuple[list[tuple[Node, ...]], list[HyperEdge], list[Node]] #
Gets the MuJoCo graph (nodes & edges) given an optional partitioning.
- Parameters:
label – the mujoco task to partition
partitioning – the partitioning scheme
- Returns:
the partition of the mujoco graph nodes, the graph edges, and global nodes
MaMuJoCo also supports the PettingZoo.AECAPI but does not expose extra functions.
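The two action-mapping helpers listed above are inverses of each other. As a rough illustration of the idea only, and not the library's implementation, the sketch below uses plain Python lists and a hypothetical `ACTION_PARTITION` table that assigns global action indices to agents. The names `ACTION_PARTITION`, `local_to_global` and `global_to_local` are made up for this example.

```python
# Hypothetical factorization: which global action indices each agent controls.
# These names and indices are assumptions for illustration, not MaMuJoCo data.
ACTION_PARTITION = {
    "agent_0": [0, 1, 2],
    "agent_1": [3, 4, 5],
}
NUM_GLOBAL_ACTIONS = 6


def local_to_global(actions):
    """Merge per-agent actions into one flat action (multi agent -> single agent)."""
    global_action = [None] * NUM_GLOBAL_ACTIONS
    for agent, indices in ACTION_PARTITION.items():
        assert len(actions[agent]) == len(indices), "wrong action size for " + agent
        for index, value in zip(indices, actions[agent]):
            # An index claimed by two agents means the factorization is badly defined.
            assert global_action[index] is None, "action defined twice"
            global_action[index] = value
    # An index claimed by no agent is just as invalid.
    assert all(value is not None for value in global_action), "action not defined"
    return global_action


def global_to_local(action):
    """Split one flat action into per-agent actions (single agent -> multi agent)."""
    assert len(action) == NUM_GLOBAL_ACTIONS, "wrong global action size"
    return {
        agent: [action[index] for index in indices]
        for agent, indices in ACTION_PARTITION.items()
    }


local_actions = {"agent_0": [0.1, 0.2, 0.3], "agent_1": [0.4, 0.5, 0.6]}
global_action = local_to_global(local_actions)
round_trip = global_to_local(global_action)
```

Splitting the merged action again returns the original per-agent dictionary, which is the property the real helpers are expected to preserve for a well-defined factorization.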
Arguments#
- gymnasium_robotics.mamujoco_v0.parallel_env.__init__(self, scenario: str, agent_conf: str | None, agent_obsk: int | None = 1, agent_factorization: dict | None = None, local_categories: list[list[str]] | None = None, global_categories: tuple[str, ...] | None = None, render_mode: str | None = None, **kwargs)#
Init.
- Parameters:
scenario – The Task/Environment, valid values: “Ant”, “HalfCheetah”, “Hopper”, “HumanoidStandup”, “Humanoid”, “Reacher”, “Swimmer”, “Pusher”, “Walker2d”, “InvertedPendulum”, “InvertedDoublePendulum”, “ManySegmentSwimmer”, “ManySegmentAnt”, “CoupledHalfCheetah”
agent_conf – ‘${Number Of Agents}x${Number Of Segments per Agent}${Optionally Additional options}’, e.g. ‘1x6’, ‘2x4’, ‘2x4d’. If it is set to None, the task becomes single agent (the agent observes the entire environment and performs all the actions)
agent_obsk – Number of nearest joints to observe. If set to 0, it only observes local state; if set to 1, it observes local state + 1 joint over; if set to 2, it observes local state + 2 joints over. If it is set to None, the task becomes single agent (the agent observes the entire environment and performs all the actions). The default value is: 1
agent_factorization – A custom factorization of the MuJoCo environment (overrides agent_conf), see [how to create new agent factorizations](https://robotics.farama.org/envs/MaMuJoCo/index.html#how-to-create-new-agent-factorizations).
local_categories – The categories of local observations for each observation depth. It takes the form of a list where the k-th element is the list of items observable at the k-th depth. For example: if it is set to [["qpos", "qvel"], ["qvel"]], then each agent observes its own position and velocity elements, and its neighbors’ velocity elements. The default is: check each environment’s page, in the “observation space” section.
global_categories – The categories of observations extracted from the globally observable space. For example: if it is set to ("qpos",), then out of the globally observable items of the environment, only the position items will be observed. The default is: check each environment’s page, in the “observation space” section.
render_mode – see [Gymnasium/MuJoCo](https://gymnasium.farama.org/environments/mujoco/), valid values: ‘human’, ‘rgb_array’, ‘depth_array’
kwargs – Additional arguments passed to the [Gymnasium/MuJoCo](https://gymnasium.farama.org/environments/mujoco/) environment. Note: arguments that change the observation space will not work.
Raises – NotImplementedError: When the scenario is not supported (not one of the valid values)
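To make the `agent_conf` string format concrete, here is a minimal sketch of how such a string could be split into its parts. The helper `parse_agent_conf` and its regular expression are hypothetical and written only for this example; the library may parse the string differently.

```python
import re

# Assumed pattern: <agents>x<segments per agent><optional suffix>, e.g. "2x4d".
AGENT_CONF_PATTERN = re.compile(r"^(\d+)x(\d+)(.*)$")


def parse_agent_conf(agent_conf):
    """Split an agent_conf string into (n_agents, n_segments, options)."""
    match = AGENT_CONF_PATTERN.match(agent_conf)
    if match is None:
        raise ValueError(f"unrecognized agent_conf: {agent_conf!r}")
    n_agents, n_segments, options = match.groups()
    return int(n_agents), int(n_segments), options


# The three example values from the parameter description.
parsed = [parse_agent_conf(conf) for conf in ("1x6", "2x4", "2x4d")]
```

Under this reading, ‘2x4d’ describes two agents with four segments each, plus the extra option string ‘d’.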
How to create new agent factorizations#
example ‘Ant-v4’, ‘8x1’#
In this example, we will create an agent factorization that is not present in Gymnasium-Robotics/MaMuJoCo: the “Ant”/‘8x1’, where each agent controls a single joint/action (first implemented by safe-MaMuJoCo).
First, we will load the graph of MaMuJoCo:
>>> from gymnasium_robotics.mamujoco_v0 import get_parts_and_edges
>>> unpartioned_nodes, edges, global_nodes = get_parts_and_edges('Ant-v4', None)
The unpartioned_nodes contain the nodes of the MaMuJoCo graph, the edges contain the edges of the graph, and the global_nodes are a set of observations for all agents.
To create our ‘8x1’ partition, we will need to partition the unpartioned_nodes:
>>> unpartioned_nodes
[(hip1, ankle1, hip2, ankle2, hip3, ankle3, hip4, ankle4)]
>>> partioned_nodes = [(unpartioned_nodes[0][0],), (unpartioned_nodes[0][1],), (unpartioned_nodes[0][2],), (unpartioned_nodes[0][3],), (unpartioned_nodes[0][4],), (unpartioned_nodes[0][5],), (unpartioned_nodes[0][6],), (unpartioned_nodes[0][7],)]
>>> partioned_nodes
[(hip1,), (ankle1,), (hip2,), (ankle2,), (hip3,), (ankle3,), (hip4,), (ankle4,)]
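The hand-written list of single-element tuples can also be produced with a comprehension. The sketch below shows the idea with plain strings standing in for the node objects; the helper name `one_node_per_agent` and the `example_*` variables are made up for this illustration and are not part of the library.

```python
def one_node_per_agent(unpartitioned_nodes):
    """Turn [(n0, n1, ...)] into [(n0,), (n1,), ...], one node per agent."""
    return [(node,) for part in unpartitioned_nodes for node in part]


# Strings stand in for the node objects returned by get_parts_and_edges.
example_nodes = [
    ("hip1", "ankle1", "hip2", "ankle2", "hip3", "ankle3", "hip4", "ankle4")
]
example_partition = one_node_per_agent(example_nodes)
```

This should yield eight one-element tuples in the same order as the original nodes, which is the shape an ‘8x1’ partition needs.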
Finally package the partitions and create our environment:
>>> my_agent_factorization = {"partition": partioned_nodes, "edges": edges, "globals": global_nodes}
>>> from gymnasium_robotics import mamujoco_v0
>>> gym_env = mamujoco_v0.parallel_env('Ant', '8x1', agent_factorization=my_agent_factorization)
Version History#
v0: Initial version release, uses Gymnasium.MuJoCo-v4, and is a fork of the original multiagent_mujoco