Franka Kitchen¶

Description¶
This environment was introduced in “Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning” by Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman.
The environment is based on the 9 degrees of freedom Franka robot. The robot is placed in a kitchen containing several common household items: a microwave, a kettle, an overhead light, cabinets, and an oven. The environment is multitask: the robot has to interact with these items in order to reach a desired goal configuration. For example, one such goal state is to have the microwave and sliding cabinet door open with the kettle on the top burner and the overhead light on. The goal tasks can be configured when the environment is created.
Goal¶
The goal has a multitask configuration. The tasks to be completed in an episode can be set by passing a list of tasks to the argument tasks_to_complete. For example, to open the microwave door and move the kettle, create the environment as follows:
import gymnasium as gym
import gymnasium_robotics
gym.register_envs(gymnasium_robotics)
env = gym.make('FrankaKitchen-v1', tasks_to_complete=['microwave', 'kettle'])
The following is a table with all the possible tasks and their respective joint goal values:
Task | Description | Joint Type | Goal |
---|---|---|---|
“bottom burner” | Turn the oven knob that activates the bottom burner | slide | [-0.88, -0.01] |
“top burner” | Turn the oven knob that activates the top burner | slide | [-0.92, -0.01] |
“light switch” | Turn on the light switch | slide | [-0.69, -0.05] |
“slide cabinet” | Open the slide cabinet | slide | 0.37 |
“hinge cabinet” | Open the left hinge cabinet | hinge | [0.0, 1.45] |
“microwave” | Open the microwave door | hinge | 0.37 |
“kettle” | Move the kettle to the top left burner | free | [-0.23, 0.75, 1.62, 0.99, 0., 0., -0.06] |
Action Space¶
The default joint actuators in the Franka MuJoCo model are position controlled. However, the action space of the environment consists of joint velocities clipped between -1 and 1 (rad/s for hinge joints, m/s for slide joints). The space is a Box(-1.0, 1.0, (9,), float32). The desired joint position control input is estimated at each time step from the current joint position values and the desired velocity action:
Num | Action | Action Min | Action Max | Joint | Unit |
---|---|---|---|---|---|
0 | | -1 | 1 | hinge | rad/s |
1 | | -1 | 1 | hinge | rad/s |
2 | | -1 | 1 | hinge | rad/s |
3 | | -1 | 1 | hinge | rad/s |
4 | | -1 | 1 | hinge | rad/s |
5 | | -1 | 1 | hinge | rad/s |
6 | | -1 | 1 | hinge | rad/s |
7 | | -1 | 1 | slide | m/s |
8 | | -1 | 1 | slide | m/s |
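The conversion from a velocity action to a position target described above can be sketched as follows. This is a minimal illustration rather than the environment's actual code: the function name and the control step `dt` are assumptions, and any clipping of the target to joint limits is left out.

```python
import numpy as np


def velocity_to_position_target(qpos, action, dt):
    """Turn a velocity action into a target for position-controlled actuators.

    `dt` is a hypothetical control step in seconds (not taken from the docs).
    """
    # Clip the commanded velocities to the documented [-1, 1] range.
    velocity = np.clip(np.asarray(action, dtype=np.float64), -1.0, 1.0)
    # Integrate one control step forward from the current joint positions.
    return np.asarray(qpos, dtype=np.float64) + velocity * dt


qpos = np.zeros(9)  # current positions of the 9 robot joints
action = np.array([0.5, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0])
target = velocity_to_position_target(qpos, action, dt=0.1)
```

The second entry of `action` lies outside the action space, so it is clipped to -1 before being integrated.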
Observation Space¶
The environment uses a goal-aware observation space. The observation space contains the following keys:
observation: this is a Box(-inf, inf, shape=(59,), dtype="float64") space formed by the robot’s joint positions and velocities, as well as the pose and velocities of the kitchen items. Uniform noise in the range [-1, 1] is added to the observations, scaled by the factors robot_noise_ratio and object_noise_ratio given in the environment arguments. The elements of the observation array are the following:
Num | Observation | Min | Max | Joint Name (in corresponding XML file) | Joint Type | Unit |
---|---|---|---|---|---|---|
0 | | -Inf | Inf | robot:panda0_joint1 | hinge | angle (rad) |
1 | | -Inf | Inf | robot:panda0_joint2 | hinge | angle (rad) |
2 | | -Inf | Inf | robot:panda0_joint3 | hinge | angle (rad) |
3 | | -Inf | Inf | robot:panda0_joint4 | hinge | angle (rad) |
4 | | -Inf | Inf | robot:panda0_joint5 | hinge | angle (rad) |
5 | | -Inf | Inf | robot:panda0_joint6 | hinge | angle (rad) |
6 | | -Inf | Inf | robot:panda0_joint7 | hinge | angle (rad) |
7 | | -Inf | Inf | robot:r_gripper_finger_joint | slide | position (m) |
8 | | -Inf | Inf | robot:l_gripper_finger_joint | slide | position (m) |
9 | | -Inf | Inf | robot:panda0_joint1 | hinge | angular velocity (rad/s) |
10 | | -Inf | Inf | robot:panda0_joint2 | hinge | angular velocity (rad/s) |
11 | | -Inf | Inf | robot:panda0_joint3 | hinge | angular velocity (rad/s) |
12 | | -Inf | Inf | robot:panda0_joint4 | hinge | angular velocity (rad/s) |
13 | | -Inf | Inf | robot:panda0_joint5 | hinge | angular velocity (rad/s) |
14 | | -Inf | Inf | robot:panda0_joint6 | hinge | angular velocity (rad/s) |
15 | | -Inf | Inf | robot:panda0_joint7 | hinge | angular velocity (rad/s) |
16 | | -Inf | Inf | robot:r_gripper_finger_joint | slide | linear velocity (m/s) |
17 | | -Inf | Inf | robot:l_gripper_finger_joint | slide | linear velocity (m/s) |
18 | Rotation of the knob for the bottom right burner | -Inf | Inf | knob_Joint_1 | hinge | angle (rad) |
19 | Joint opening of the bottom right burner | -Inf | Inf | bottom_right_burner | slide | position (m) |
20 | Rotation of the knob for the bottom left burner | -Inf | Inf | knob_Joint_2 | hinge | angle (rad) |
21 | Joint opening of the bottom left burner | -Inf | Inf | bottom_left_burner | slide | position (m) |
22 | Rotation of the knob for the top right burner | -Inf | Inf | knob_Joint_3 | hinge | angle (rad) |
23 | Joint opening of the top right burner | -Inf | Inf | top_right_burner | slide | position (m) |
24 | Rotation of the knob for the top left burner | -Inf | Inf | knob_Joint_4 | hinge | angle (rad) |
25 | Joint opening of the top left burner | -Inf | Inf | top_left_burner | slide | position (m) |
26 | Joint angle value of the overhead light switch | -Inf | Inf | light_switch | slide | position (m) |
27 | Opening of the overhead light joint | -Inf | Inf | light_joint | hinge | angle (rad) |
28 | Translation of the slide cabinet joint | -Inf | Inf | slide_cabinet | slide | position (m) |
29 | Rotation of the joint in the left hinge cabinet | -Inf | Inf | left_hinge_cabinet | hinge | angle (rad) |
30 | Rotation of the joint in the right hinge cabinet | -Inf | Inf | right_hinge_cabinet | hinge | angle (rad) |
31 | Rotation of the joint in the microwave door | -Inf | Inf | microwave | hinge | angle (rad) |
32 | Kettle’s x coordinate | -Inf | Inf | kettle | free | position (m) |
33 | Kettle’s y coordinate | -Inf | Inf | kettle | free | position (m) |
34 | Kettle’s z coordinate | -Inf | Inf | kettle | free | position (m) |
35 | Kettle’s x quaternion rotation | -Inf | Inf | kettle | free | - |
36 | Kettle’s y quaternion rotation | -Inf | Inf | kettle | free | - |
37 | Kettle’s z quaternion rotation | -Inf | Inf | kettle | free | - |
38 | Kettle’s w quaternion rotation | -Inf | Inf | kettle | free | - |
39 | Bottom right burner knob angular velocity | -Inf | Inf | knob_Joint_1 | hinge | angular velocity (rad/s) |
40 | Opening linear velocity of the bottom right burner | -Inf | Inf | bottom_right_burner | slide | velocity (m/s) |
41 | Bottom left burner knob angular velocity | -Inf | Inf | knob_Joint_2 | hinge | angular velocity (rad/s) |
42 | Opening linear velocity of the bottom left burner | -Inf | Inf | bottom_left_burner | slide | velocity (m/s) |
43 | Top right burner knob angular velocity | -Inf | Inf | knob_Joint_3 | hinge | angular velocity (rad/s) |
44 | Opening linear velocity of the top right burner | -Inf | Inf | top_right_burner | slide | velocity (m/s) |
45 | Top left burner knob angular velocity | -Inf | Inf | knob_Joint_4 | hinge | angular velocity (rad/s) |
46 | Opening linear velocity of the top left burner | -Inf | Inf | top_left_burner | slide | velocity (m/s) |
47 | Angular velocity of the overhead light switch | -Inf | Inf | light_switch | slide | velocity (m/s) |
48 | Opening linear velocity of the overhead light | -Inf | Inf | light_joint | hinge | angular velocity (rad/s) |
49 | Linear velocity of the slide cabinet joint | -Inf | Inf | slide_cabinet | slide | velocity (m/s) |
50 | Angular velocity of the left hinge cabinet joint | -Inf | Inf | left_hinge_cabinet | hinge | angular velocity (rad/s) |
51 | Angular velocity of the right hinge cabinet joint | -Inf | Inf | right_hinge_cabinet | hinge | angular velocity (rad/s) |
52 | Angular velocity of the microwave door joint | -Inf | Inf | microwave | hinge | angular velocity (rad/s) |
53 | Kettle’s x linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
54 | Kettle’s y linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
55 | Kettle’s z linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
56 | Kettle’s x axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
57 | Kettle’s y axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
58 | Kettle’s z axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
desired_goal: this key represents the final goal to be achieved. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the goal configurations of the joints in each task, as specified in the Goal section.
achieved_goal: this key represents the current state of the tasks. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the current configurations of the joints in each task.
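The layout of the 59-element observation array can be made easier to work with by slicing it into named segments. The sketch below follows the index ranges of the observation table; the segment names and the helper function are illustrative assumptions, not part of the environment's API.

```python
import numpy as np

# Index ranges read off the observation table (segment names are hypothetical).
OBS_SLICES = {
    "robot_qpos": slice(0, 9),     # 7 arm joint angles + 2 gripper finger positions
    "robot_qvel": slice(9, 18),    # matching robot joint velocities
    "object_qpos": slice(18, 39),  # kitchen item joint positions and kettle pose
    "object_qvel": slice(39, 59),  # kitchen item and kettle velocities
}


def split_observation(obs):
    """Split the flat observation array into named segments."""
    obs = np.asarray(obs)
    if obs.shape != (59,):
        raise ValueError(f"expected shape (59,), got {obs.shape}")
    return {name: obs[segment] for name, segment in OBS_SLICES.items()}


# Dummy observation whose values equal their own indices, for illustration.
parts = split_observation(np.arange(59, dtype=np.float64))
# Kettle x, y, z are observation indices 32 to 34, i.e. offsets 14 to 16 here.
kettle_xyz = parts["object_qpos"][14:17]
```

The position segment of the kitchen items has one more entry than the velocity segment because the kettle's free joint contributes seven position values (three coordinates and a quaternion) but six velocity values.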
Info¶
The environment also returns an info dictionary in each Gymnasium step. The keys are:
tasks_to_complete (list[str]): list of tasks that haven’t yet been completed in the current episode.
step_task_completions (list[str]): list of tasks completed in the step taken.
episode_task_completions (list[str]): list of tasks completed during the episode up to the current step.
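How the three info keys relate to each other over successive steps can be illustrated with a small bookkeeping function. This is a sketch under assumptions: the helper `update_task_info` is hypothetical, it always drops completed tasks from the pending list, and it does not reproduce the environment's internal code.

```python
def update_task_info(tasks_to_complete, episode_task_completions, completed_now):
    """Return the info keys after one step.

    `completed_now` is the set of task names whose goal was reached in this step.
    """
    # Only tasks that were still pending can be newly completed.
    step_completions = [t for t in tasks_to_complete if t in completed_now]
    remaining = [t for t in tasks_to_complete if t not in step_completions]
    return {
        "tasks_to_complete": remaining,
        "step_task_completions": step_completions,
        "episode_task_completions": episode_task_completions + step_completions,
    }


# Step 1: the microwave task is completed.
info_1 = update_task_info(["microwave", "kettle"], [], {"microwave"})
# Step 2: the kettle task is completed.
info_2 = update_task_info(
    info_1["tasks_to_complete"], info_1["episode_task_completions"], {"kettle"}
)
```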
Rewards¶
The environment’s reward is sparse. The reward in each Gymnasium step is equal to the number of tasks completed in that step. If no task is completed, the returned reward is zero. A task is considered completed when the norm of the difference between its joint configuration and the goal configuration specified in the Goal section is within a threshold of 0.3.
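The completion check and the resulting sparse reward can be sketched with NumPy. The function below is an illustration, not the environment's implementation: its name is hypothetical, it assumes a strict `<` comparison against the threshold, and it expects to receive only the tasks that are still pending.

```python
import numpy as np

THRESHOLD = 0.3  # norm threshold stated in the Rewards section


def sparse_reward(achieved_goal, desired_goal, threshold=THRESHOLD):
    """Count the tasks whose joint configuration is close enough to its goal."""
    completed = []
    for task, goal in desired_goal.items():
        # Euclidean norm of the difference between current and goal joint values.
        distance = np.linalg.norm(
            np.asarray(achieved_goal[task], dtype=np.float64)
            - np.asarray(goal, dtype=np.float64)
        )
        if distance < threshold:
            completed.append(task)
    return float(len(completed)), completed


# Goal values taken from the Goal table; achieved values are made up.
desired = {"slide cabinet": [0.37], "hinge cabinet": [0.0, 1.45]}
achieved = {"slide cabinet": [0.30], "hinge cabinet": [0.0, 0.2]}
reward, completed = sparse_reward(achieved, desired)
```

Here the slide cabinet is 0.07 away from its goal and counts as completed, while the hinge cabinet is 1.25 away and does not.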
Starting State¶
The simulation starts with all of the joint position actuators of the Franka robot set to zero. The doors of the microwave and cabinets are closed, the burners are turned off, and the light switch is also off. The kettle is placed on the bottom left burner.
Episode End¶
The episode is truncated when its duration reaches max_episode_steps, which is set to 280 timesteps by default.
The episode is terminated when all the tasks have been completed, unless the terminate_on_tasks_completed argument is set to False.
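The two ways an episode can end can be summarized in a few lines. This is a minimal sketch of the logic described above, with a hypothetical function name; in practice both flags are returned by the environment's step call.

```python
def episode_end_flags(
    step_count,
    remaining_tasks,
    max_episode_steps=280,
    terminate_on_tasks_completed=True,
):
    """Return (terminated, truncated) for the current step."""
    # Terminated: every task is done and early termination is enabled.
    terminated = terminate_on_tasks_completed and len(remaining_tasks) == 0
    # Truncated: the step budget has been used up.
    truncated = step_count >= max_episode_steps
    return terminated, truncated


all_done_early = episode_end_flags(step_count=120, remaining_tasks=[])
out_of_time = episode_end_flags(step_count=280, remaining_tasks=["kettle"])
```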
Arguments¶
The following arguments can be passed as keyword arguments to gymnasium.make when initializing the environment:
Parameter | Type | Default | Description |
---|---|---|---|
tasks_to_complete | list[str] | All possible goal tasks. Go to Goal section | The goal tasks to reach in each episode |
terminate_on_tasks_completed | bool | True | Terminate episode if no more tasks to complete (episodic multitask) |
 | bool | | Remove the completed tasks from the info dictionary returned after each step |
object_noise_ratio | float | | Scaling factor applied to the uniform noise added to the kitchen object observations |
robot_noise_ratio | float | | Scaling factor applied to the uniform noise added to the robot joint observations |
max_episode_steps | integer | 280 | Maximum number of steps per episode |
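The effect of the two noise scaling arguments can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, the split at index 18 follows the observation table (robot entries first, kitchen objects after), the ratio values are chosen only for the example, and the environment's real noise model may differ in detail.

```python
import numpy as np


def add_observation_noise(obs, robot_noise_ratio, object_noise_ratio, rng):
    """Add uniform noise in [-1, 1], scaled separately for robot and objects."""
    obs = np.asarray(obs, dtype=np.float64)
    noise = rng.uniform(-1.0, 1.0, size=obs.shape)
    # Per-element scale: robot entries (0 to 17) and kitchen object entries (18 to 58).
    scale = np.empty_like(obs)
    scale[:18] = robot_noise_ratio
    scale[18:] = object_noise_ratio
    return obs + scale * noise


rng = np.random.default_rng(0)
clean = np.zeros(59)
# Illustrative ratios, not taken from the documentation.
noisy = add_observation_noise(
    clean, robot_noise_ratio=0.01, object_noise_ratio=0.0005, rng=rng
)
```

With a clean observation of zeros, every robot entry of `noisy` stays within 0.01 of zero and every object entry within 0.0005.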
Version History¶
v1: updated version using the most recent Python MuJoCo bindings.
v0: legacy versions in D4RL.