Franka Kitchen¶

Description¶

This environment was introduced in “Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning” by Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman.

The environment is based on the 9-degree-of-freedom Franka robot. The robot is placed in a kitchen containing several common household items: a microwave, a kettle, an overhead light, cabinets, and an oven. The environment is multitask: the robot has to interact with these items to reach a desired goal configuration. For example, one such goal is to have the microwave and sliding cabinet door open, the kettle on the top burner, and the overhead light on. The goal tasks can be configured when the environment is created.

Goal¶

The goal has a multitask configuration. The tasks to be completed in an episode are set by passing a list of task names to the tasks_to_complete argument. For example, to open the microwave door and move the kettle, create the environment as follows:

import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

env = gym.make('FrankaKitchen-v1', tasks_to_complete=['microwave', 'kettle'])

The following is a table with all the possible tasks and their respective joint goal values:

Task	Description	Joint Type	Goal
“bottom burner”	Turn the oven knob that activates the bottom burner	slide	[-0.88, -0.01]
“top burner”	Turn the oven knob that activates the top burner	slide	[-0.92, -0.01]
“light switch”	Turn on the light switch	slide	[-0.69, -0.05]
“slide cabinet”	Open the slide cabinet	slide	0.37
“hinge cabinet”	Open the left hinge cabinet	hinge	[0.0, 1.45]
“microwave”	Open the microwave door	hinge	-0.75
“kettle”	Move the kettle to the top left burner	free	[-0.23, 0.75, 1.62, 0.99, 0., 0., -0.06]

Action Space¶

The default joint actuators in the Franka MuJoCo model are position controlled. However, the action space of the environment consists of joint velocities clipped between -1 and 1 (rad/s for the arm joints, m/s for the gripper fingers). The space is a Box(-1.0, 1.0, (9,), float32). The desired joint position control input is estimated at each time step from the current joint positions and the desired velocity action:

Num	Action	Action Min	Action Max	Joint	Unit
0	`robot:panda0_joint1` angular velocity	-1	1	hinge	rad/s
1	`robot:panda0_joint2` angular velocity	-1	1	hinge	rad/s
2	`robot:panda0_joint3` angular velocity	-1	1	hinge	rad/s
3	`robot:panda0_joint4` angular velocity	-1	1	hinge	rad/s
4	`robot:panda0_joint5` angular velocity	-1	1	hinge	rad/s
5	`robot:panda0_joint6` angular velocity	-1	1	hinge	rad/s
6	`robot:panda0_joint7` angular velocity	-1	1	hinge	rad/s
7	`robot:r_gripper_finger_joint` linear velocity	-1	1	slide	m/s
8	`robot:l_gripper_finger_joint` linear velocity	-1	1	slide	m/s
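The conversion from a velocity action to a position control input described above can be sketched as a single integration step. This is a simplified illustration, not the environment's actual code: the function name `position_target` and the `dt` argument (the duration of one control step) are assumptions, and any per-joint velocity scaling or joint-limit clipping the real environment may apply is ignored.

```python
def position_target(current_qpos, velocity_action, dt):
    """Estimate position targets from a velocity action (hypothetical helper).

    current_qpos: the 9 current joint positions.
    velocity_action: the 9 desired joint velocities (the action).
    dt: assumed duration of one control step, in seconds.
    """
    if len(current_qpos) != len(velocity_action):
        raise ValueError("positions and action must have the same length")
    # Clip each velocity to the action space bounds [-1, 1].
    clipped = [max(-1.0, min(1.0, v)) for v in velocity_action]
    # Integrate the velocity over one step to get the position target.
    return [q + v * dt for q, v in zip(current_qpos, clipped)]


targets = position_target([0.0] * 9, [2.0] + [0.0] * 8, dt=0.1)
```

In this example the first velocity is out of range, so it is clipped to 1 before being integrated.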

Observation Space¶

The observation is a goal-aware observation space. The observation space contains the following keys:

observation: a Box(-inf, inf, shape=(59,), dtype="float64") space formed by the robot’s joint positions and velocities, as well as the pose and velocities of the kitchen items. Uniform noise in the range [-1, 1], scaled by the robot_noise_ratio and object_noise_ratio environment arguments, is added to the observations. The elements of the observation array are the following:

Num	Observation	Min	Max	Joint Name (in corresponding XML file)	Joint Type	Unit
0	`robot:panda0_joint1` hinge joint angle value	-Inf	Inf	robot:panda0_joint1	hinge	angle (rad)
1	`robot:panda0_joint2` hinge joint angle value	-Inf	Inf	robot:panda0_joint2	hinge	angle (rad)
2	`robot:panda0_joint3` hinge joint angle value	-Inf	Inf	robot:panda0_joint3	hinge	angle (rad)
3	`robot:panda0_joint4` hinge joint angle value	-Inf	Inf	robot:panda0_joint4	hinge	angle (rad)
4	`robot:panda0_joint5` hinge joint angle value	-Inf	Inf	robot:panda0_joint5	hinge	angle (rad)
5	`robot:panda0_joint6` hinge joint angle value	-Inf	Inf	robot:panda0_joint6	hinge	angle (rad)
6	`robot:panda0_joint7` hinge joint angle value	-Inf	Inf	robot:panda0_joint7	hinge	angle (rad)
7	`robot:r_gripper_finger_joint` slide joint translation value	-Inf	Inf	robot:r_gripper_finger_joint	slide	position (m)
8	`robot:l_gripper_finger_joint` slide joint translation value	-Inf	Inf	robot:l_gripper_finger_joint	slide	position (m)
9	`robot:panda0_joint1` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint1	hinge	angular velocity (rad/s)
10	`robot:panda0_joint2` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint2	hinge	angular velocity (rad/s)
11	`robot:panda0_joint3` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint3	hinge	angular velocity (rad/s)
12	`robot:panda0_joint4` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint4	hinge	angular velocity (rad/s)
13	`robot:panda0_joint5` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint5	hinge	angular velocity (rad/s)
14	`robot:panda0_joint6` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint6	hinge	angular velocity (rad/s)
15	`robot:panda0_joint7` hinge joint angular velocity	-Inf	Inf	robot:panda0_joint7	hinge	angular velocity (rad/s)
16	`robot:r_gripper_finger_joint` slide joint linear velocity	-Inf	Inf	robot:r_gripper_finger_joint	slide	linear velocity (m/s)
17	`robot:l_gripper_finger_joint` slide joint linear velocity	-Inf	Inf	robot:l_gripper_finger_joint	slide	linear velocity (m/s)
18	Rotation of the knob for the bottom right burner	-Inf	Inf	knob_Joint_1	hinge	angle (rad)
19	Joint opening of the bottom right burner	-Inf	Inf	bottom_right_burner	slide	position (m)
20	Rotation of the knob for the bottom left burner	-Inf	Inf	knob_Joint_2	hinge	angle (rad)
21	Joint opening of the bottom left burner	-Inf	Inf	bottom_left_burner	slide	position (m)
22	Rotation of the knob for the top right burner	-Inf	Inf	knob_Joint_3	hinge	angle (rad)
23	Joint opening of the top right burner	-Inf	Inf	top_right_burner	slide	position (m)
24	Rotation of the knob for the top left burner	-Inf	Inf	knob_Joint_4	hinge	angle (rad)
25	Joint opening of the top left burner	-Inf	Inf	top_left_burner	slide	position (m)
26	Joint angle value of the overhead light switch	-Inf	Inf	light_switch	slide	position (m)
27	Opening of the overhead light joint	-Inf	Inf	light_joint	hinge	angle (rad)
28	Translation of the slide cabinet joint	-Inf	Inf	slide_cabinet	slide	position (m)
29	Rotation of the joint in the left hinge cabinet	-Inf	Inf	left_hinge_cabinet	hinge	angle (rad)
30	Rotation of the joint in the right hinge cabinet	-Inf	Inf	right_hinge_cabinet	hinge	angle (rad)
31	Rotation of the joint in the microwave door	-Inf	Inf	microwave	hinge	angle (rad)
32	Kettle’s x coordinate	-Inf	Inf	kettle	free	position (m)
33	Kettle’s y coordinate	-Inf	Inf	kettle	free	position (m)
34	Kettle’s z coordinate	-Inf	Inf	kettle	free	position (m)
35	Kettle’s x quaternion rotation	-Inf	Inf	kettle	free	-
36	Kettle’s y quaternion rotation	-Inf	Inf	kettle	free	-
37	Kettle’s z quaternion rotation	-Inf	Inf	kettle	free	-
38	Kettle’s w quaternion rotation	-Inf	Inf	kettle	free	-
39	Bottom right burner knob angular velocity	-Inf	Inf	knob_Joint_1	hinge	angular velocity (rad/s)
40	Opening linear velocity of the bottom right burner	-Inf	Inf	bottom_right_burner	slide	velocity (m/s)
41	Bottom left burner knob angular velocity	-Inf	Inf	knob_Joint_2	hinge	angular velocity (rad/s)
42	Opening linear velocity of the bottom left burner	-Inf	Inf	bottom_left_burner	slide	velocity (m/s)
43	Top right burner knob angular velocity	-Inf	Inf	knob_Joint_3	hinge	angular velocity (rad/s)
44	Opening linear velocity of the top right burner	-Inf	Inf	top_right_burner	slide	velocity (m/s)
45	Top left burner knob angular velocity	-Inf	Inf	knob_Joint_4	hinge	angular velocity (rad/s)
46	Opening linear velocity of the top left burner	-Inf	Inf	top_left_burner	slide	velocity (m/s)
47	Angular velocity of the overhead light switch	-Inf	Inf	light_switch	slide	velocity (m/s)
48	Opening linear velocity of the overhead light	-Inf	Inf	light_joint	hinge	angular velocity (rad/s)
49	Linear velocity of the slide cabinet joint	-Inf	Inf	slide_cabinet	slide	velocity (m/s)
50	Angular velocity of the left hinge cabinet joint	-Inf	Inf	left_hinge_cabinet	hinge	angular velocity (rad/s)
51	Angular velocity of the right hinge cabinet joint	-Inf	Inf	right_hinge_cabinet	hinge	angular velocity (rad/s)
52	Angular velocity of the microwave door joint	-Inf	Inf	microwave	hinge	angular velocity (rad/s)
53	Kettle’s x linear velocity	-Inf	Inf	kettle	free	linear velocity (m/s)
54	Kettle’s y linear velocity	-Inf	Inf	kettle	free	linear velocity (m/s)
55	Kettle’s z linear velocity	-Inf	Inf	kettle	free	linear velocity (m/s)
56	Kettle’s x axis angular velocity	-Inf	Inf	kettle	free	angular velocity (rad/s)
57	Kettle’s y axis angular velocity	-Inf	Inf	kettle	free	angular velocity (rad/s)
58	Kettle’s z axis angular velocity	-Inf	Inf	kettle	free	angular velocity (rad/s)

desired_goal: this key represents the final goal to be achieved. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the goal joint configurations of each task, as specified in the Goal section.
achieved_goal: this key represents the current state of the tasks. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the current joint configurations of each task.
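The layout of the observation array can be made concrete with a short sketch that splits a 59-element vector into the four groups listed in the table, and that mimics the scaled uniform noise. The helper names `split_observation` and `add_uniform_noise` are hypothetical and not part of the library. The noise model shown (value plus ratio times a uniform sample in [-1, 1]) is an assumption based on the description above, and the real environment may apply further scaling.

```python
import random

# Index ranges taken from the observation table.
ROBOT_QPOS = slice(0, 9)     # arm joint angles and finger positions
ROBOT_QVEL = slice(9, 18)    # arm and finger velocities
OBJECT_QPOS = slice(18, 39)  # kitchen joints plus the kettle pose
OBJECT_QVEL = slice(39, 59)  # kitchen joint and kettle velocities


def split_observation(observation):
    """Split a flat observation into named groups (hypothetical helper)."""
    if len(observation) != 59:
        raise ValueError("expected an observation with 59 elements")
    return {
        "robot_qpos": list(observation[ROBOT_QPOS]),
        "robot_qvel": list(observation[ROBOT_QVEL]),
        "object_qpos": list(observation[OBJECT_QPOS]),
        "object_qvel": list(observation[OBJECT_QVEL]),
    }


def add_uniform_noise(values, noise_ratio, rng):
    """Add uniform noise in [-1, 1] scaled by noise_ratio (assumed model)."""
    return [v + noise_ratio * rng.uniform(-1.0, 1.0) for v in values]


parts = split_observation(list(range(59)))
rng = random.Random(0)
noisy_robot_qpos = add_uniform_noise(parts["robot_qpos"], 0.01, rng)
```

With a real environment, the same slices would be applied to the array stored under the observation key of the returned dictionary.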

Info¶

The environment also returns an info dictionary in each Gymnasium step. The keys are:

tasks_to_complete (list[str]): list of tasks that haven’t yet been completed in the current episode.
step_task_completions (list[str]): list of tasks completed in the step taken.
episode_task_completions (list[str]): list of tasks completed during the episode up to the current step.

Rewards¶

The environment’s reward is sparse. The reward in each Gymnasium step is equal to the number of tasks completed in that step; if no task is completed, the reward is zero. A task is considered completed when the norm of the difference between its joint configuration and the goal configuration specified in the Goal section is within a threshold of 0.3.
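The completion check and sparse reward described above can be sketched as follows. This is an illustration under stated assumptions rather than the environment's implementation: the function names are hypothetical, goals are represented as plain lists, the norm is taken to be Euclidean, and the comparison is assumed to be strict.

```python
import math


def completed_tasks(achieved_goal, desired_goal, threshold=0.3):
    """Return the tasks whose joint configuration is near the goal."""
    done = []
    for task, goal in desired_goal.items():
        # Euclidean distance between current and goal joint values.
        distance = math.dist(achieved_goal[task], goal)
        if distance < threshold:  # strict comparison is an assumption
            done.append(task)
    return done


def sparse_reward(achieved_goal, desired_goal, threshold=0.3):
    """Reward equals the number of tasks completed in this step."""
    return float(len(completed_tasks(achieved_goal, desired_goal, threshold)))


desired = {"slide cabinet": [0.37], "hinge cabinet": [0.0, 1.45]}
achieved = {"slide cabinet": [0.30], "hinge cabinet": [0.0, 0.2]}
reward = sparse_reward(achieved, desired)
```

Here only the slide cabinet is close enough to its goal, so a single task counts toward the reward.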

Starting State¶

The simulation starts with all of the joint position actuators of the Franka robot set to zero. The doors of the microwave and cabinets are closed, the burners are turned off, and the light switch is off. The kettle is placed on the bottom left burner.

Episode End¶

The episode is truncated when its duration reaches max_episode_steps, which defaults to 280 timesteps. The episode is terminated when all the tasks have been completed, unless the terminate_on_tasks_completed argument is set to False.
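The two end conditions can be summarized in a small sketch. The function `episode_flags` is hypothetical and only mirrors the rules stated above; in practice, truncation is typically handled by Gymnasium's time limit handling rather than by user code.

```python
def episode_flags(
    steps_taken,
    remaining_tasks,
    max_episode_steps=280,
    terminate_on_tasks_completed=True,
):
    """Return (terminated, truncated) for the current step (hypothetical)."""
    # Terminate only when every task is done and termination is enabled.
    terminated = terminate_on_tasks_completed and len(remaining_tasks) == 0
    # Truncate once the step budget is exhausted.
    truncated = steps_taken >= max_episode_steps
    return terminated, truncated


flags_done = episode_flags(10, [])
flags_timeout = episode_flags(280, ["kettle"])
```

A training loop would stop stepping the environment as soon as either flag is true.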

Arguments¶

The following arguments can be passed when initializing the environment with gymnasium.make kwargs:

Parameter	Type	Default	Description
`tasks_to_complete`	list[str]	All possible goal tasks (see the Goal section)	The goal tasks to reach in each episode
`terminate_on_tasks_completed`	bool	`True`	Terminate episode if no more tasks to complete (episodic multitask)
`remove_task_when_completed`	bool	`True`	Remove the completed tasks from the info dictionary returned after each step
`object_noise_ratio`	float	`0.0005`	Scaling factor applied to the uniform noise added to the kitchen object observations
`robot_noise_ratio`	float	`0.01`	Scaling factor applied to the uniform noise added to the robot joint observations
`max_episode_steps`	integer	`280`	Maximum number of steps per episode

Version History¶

v1: updated version using the most recent Python MuJoCo bindings.
v0: legacy version from D4RL.