Franka Kitchen#
Description#
This environment was introduced in “Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning” by Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman.
The environment is based on the 9 degrees of freedom Franka robot. The robot is placed in a kitchen containing several common
household items: a microwave, a kettle, an overhead light, cabinets, and an oven. The environment is multitask:
the robot has to interact with these items to reach a desired goal configuration.
For example, one such goal state has the microwave and sliding cabinet door open, the kettle on the top burner,
and the overhead light on. The goal tasks can be configured when the environment is created.
Goal#
The goal has a multitask configuration. The tasks to be completed in an episode can be set by passing a list of tasks to the argument tasks_to_complete. For example, to open
the microwave door and move the kettle, create the environment as follows:
import gymnasium as gym
import gymnasium_robotics  # importing the package registers the Franka Kitchen environments

env = gym.make('FrankaKitchen-v1', tasks_to_complete=['microwave', 'kettle'])
The following is a table with all the possible tasks and their respective joint goal values:
| Task | Description | Joint Type | Goal |
| --- | --- | --- | --- |
| “bottom burner” | Turn the oven knob that activates the bottom burner | slide | [-0.88, -0.01] |
| “top burner” | Turn the oven knob that activates the top burner | slide | [-0.92, -0.01] |
| “light switch” | Turn on the light switch | slide | [-0.69, -0.05] |
| “slide cabinet” | Open the slide cabinet | slide | 0.37 |
| “hinge cabinet” | Open the left hinge cabinet | hinge | [0.0, 1.45] |
| “microwave” | Open the microwave door | hinge | 0.37 |
| “kettle” | Move the kettle to the top left burner | free | [-0.23, 0.75, 1.62, 0.99, 0., 0., -0.06] |
Action Space#
The default joint actuators in the Franka MuJoCo model are position controlled. However, the action space of the environment consists of joint velocities clipped between -1 and 1 rad/s.
The space is a Box(-1.0, 1.0, (9,), float32). The desired joint position control input is estimated in each time step from the current joint position values and the desired velocity
action:
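| Num | Action | Action Min | Action Max | Joint | Unit |
| --- | --- | --- | --- | --- | --- |
| 0 | | -1 | 1 | hinge | rad/s |
| 1 | | -1 | 1 | hinge | rad/s |
| 2 | | -1 | 1 | hinge | rad/s |
| 3 | | -1 | 1 | hinge | rad/s |
| 4 | | -1 | 1 | hinge | rad/s |
| 5 | | -1 | 1 | hinge | rad/s |
| 6 | | -1 | 1 | hinge | rad/s |
| 7 | | -1 | 1 | slide | m/s |
| 8 | | -1 | 1 | slide | m/s |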
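The conversion from a velocity action to a position control input described above can be sketched as follows. This is a minimal illustration, not the environment's implementation: the function name `velocity_to_position_target` and the time step `dt` are assumptions made for the example, and any additional clipping against joint limits is left out.

```python
def velocity_to_position_target(joint_pos, velocity_action, dt, max_speed=1.0):
    """Estimate position targets from current positions and a velocity action.

    Hypothetical helper: `dt` (seconds per control step) is an assumed value,
    not one taken from the environment.
    """
    if len(joint_pos) != len(velocity_action):
        raise ValueError("joint_pos and velocity_action must have the same length")
    targets = []
    for q, v in zip(joint_pos, velocity_action):
        # Clip the commanded velocity to the action space bounds [-1, 1].
        v_clipped = max(-max_speed, min(max_speed, v))
        # Integrate the velocity over one control step.
        targets.append(q + v_clipped * dt)
    return targets


# 7 arm joints (rad/s) followed by 2 gripper fingers (m/s).
current_pos = [0.0] * 9
action = [0.5] * 7 + [2.0, -2.0]  # the last two exceed the bounds and get clipped
targets = velocity_to_position_target(current_pos, action, dt=0.05)
```

With these inputs the two out-of-range finger commands are clipped to 1 and -1 before integration, so their targets move by exactly one `dt` worth of travel.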
Observation Space#
The observation is a goal-aware observation space. The observation space contains the following keys:
observation: this is a Box(-inf, inf, shape=(59,), dtype="float64") space formed by the robot’s joint positions and velocities, as well as the pose and velocities of the kitchen items. Uniform noise in the range [-1, 1] is added to the observations, scaled by the factors robot_noise_ratio and object_noise_ratio given in the environment arguments. The elements of the observation array are the following:
| Num | Observation | Min | Max | Joint Name (in corresponding XML file) | Joint Type | Unit |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | | -Inf | Inf | robot:panda0_joint1 | hinge | angle (rad) |
| 1 | | -Inf | Inf | robot:panda0_joint2 | hinge | angle (rad) |
| 2 | | -Inf | Inf | robot:panda0_joint3 | hinge | angle (rad) |
| 3 | | -Inf | Inf | robot:panda0_joint4 | hinge | angle (rad) |
| 4 | | -Inf | Inf | robot:panda0_joint5 | hinge | angle (rad) |
| 5 | | -Inf | Inf | robot:panda0_joint6 | hinge | angle (rad) |
| 6 | | -Inf | Inf | robot:panda0_joint7 | hinge | angle (rad) |
| 7 | | -Inf | Inf | robot:r_gripper_finger_joint | slide | position (m) |
| 8 | | -Inf | Inf | robot:l_gripper_finger_joint | slide | position (m) |
| 9 | | -Inf | Inf | robot:panda0_joint1 | hinge | angular velocity (rad/s) |
| 10 | | -Inf | Inf | robot:panda0_joint2 | hinge | angular velocity (rad/s) |
| 11 | | -Inf | Inf | robot:panda0_joint3 | hinge | angular velocity (rad/s) |
| 12 | | -Inf | Inf | robot:panda0_joint4 | hinge | angular velocity (rad/s) |
| 13 | | -Inf | Inf | robot:panda0_joint5 | hinge | angular velocity (rad/s) |
| 14 | | -Inf | Inf | robot:panda0_joint6 | hinge | angular velocity (rad/s) |
| 15 | | -Inf | Inf | robot:panda0_joint7 | hinge | angular velocity (rad/s) |
| 16 | | -Inf | Inf | robot:r_gripper_finger_joint | slide | linear velocity (m/s) |
| 17 | | -Inf | Inf | robot:l_gripper_finger_joint | slide | linear velocity (m/s) |
| 18 | Rotation of the knob for the bottom right burner | -Inf | Inf | knob_Joint_1 | hinge | angle (rad) |
| 19 | Joint opening of the bottom right burner | -Inf | Inf | bottom_right_burner | slide | position (m) |
| 20 | Rotation of the knob for the bottom left burner | -Inf | Inf | knob_Joint_2 | hinge | angle (rad) |
| 21 | Joint opening of the bottom left burner | -Inf | Inf | bottom_left_burner | slide | position (m) |
| 22 | Rotation of the knob for the top right burner | -Inf | Inf | knob_Joint_3 | hinge | angle (rad) |
| 23 | Joint opening of the top right burner | -Inf | Inf | top_right_burner | slide | position (m) |
| 24 | Rotation of the knob for the top left burner | -Inf | Inf | knob_Joint_4 | hinge | angle (rad) |
| 25 | Joint opening of the top left burner | -Inf | Inf | top_left_burner | slide | position (m) |
| 26 | Joint angle value of the overhead light switch | -Inf | Inf | light_switch | slide | position (m) |
| 27 | Opening of the overhead light joint | -Inf | Inf | light_joint | hinge | angle (rad) |
| 28 | Translation of the slide cabinet joint | -Inf | Inf | slide_cabinet | slide | position (m) |
| 29 | Rotation of the joint in the left hinge cabinet | -Inf | Inf | left_hinge_cabinet | hinge | angle (rad) |
| 30 | Rotation of the joint in the right hinge cabinet | -Inf | Inf | right_hinge_cabinet | hinge | angle (rad) |
| 31 | Rotation of the joint in the microwave door | -Inf | Inf | microwave | hinge | angle (rad) |
| 32 | Kettle’s x coordinate | -Inf | Inf | kettle | free | position (m) |
| 33 | Kettle’s y coordinate | -Inf | Inf | kettle | free | position (m) |
| 34 | Kettle’s z coordinate | -Inf | Inf | kettle | free | position (m) |
| 35 | Kettle’s x quaternion rotation | -Inf | Inf | kettle | free | |
| 36 | Kettle’s y quaternion rotation | -Inf | Inf | kettle | free | |
| 37 | Kettle’s z quaternion rotation | -Inf | Inf | kettle | free | |
| 38 | Kettle’s w quaternion rotation | -Inf | Inf | kettle | free | |
| 39 | Bottom right burner knob angular velocity | -Inf | Inf | knob_Joint_1 | hinge | angular velocity (rad/s) |
| 40 | Opening linear velocity of the bottom right burner | -Inf | Inf | bottom_right_burner | slide | velocity (m/s) |
| 41 | Bottom left burner knob angular velocity | -Inf | Inf | knob_Joint_2 | hinge | angular velocity (rad/s) |
| 42 | Opening linear velocity of the bottom left burner | -Inf | Inf | bottom_left_burner | slide | velocity (m/s) |
| 43 | Top right burner knob angular velocity | -Inf | Inf | knob_Joint_3 | hinge | angular velocity (rad/s) |
| 44 | Opening linear velocity of the top right burner | -Inf | Inf | top_right_burner | slide | velocity (m/s) |
| 45 | Top left burner knob angular velocity | -Inf | Inf | knob_Joint_4 | hinge | angular velocity (rad/s) |
| 46 | Opening linear velocity of the top left burner | -Inf | Inf | top_left_burner | slide | velocity (m/s) |
| 47 | Angular velocity of the overhead light switch | -Inf | Inf | light_switch | slide | velocity (m/s) |
| 48 | Opening linear velocity of the overhead light | -Inf | Inf | light_joint | hinge | angular velocity (rad/s) |
| 49 | Linear velocity of the slide cabinet joint | -Inf | Inf | slide_cabinet | slide | velocity (m/s) |
| 50 | Angular velocity of the left hinge cabinet joint | -Inf | Inf | left_hinge_cabinet | hinge | angular velocity (rad/s) |
| 51 | Angular velocity of the right hinge cabinet joint | -Inf | Inf | right_hinge_cabinet | hinge | angular velocity (rad/s) |
| 52 | Angular velocity of the microwave door joint | -Inf | Inf | microwave | hinge | angular velocity (rad/s) |
| 53 | Kettle’s x linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
| 54 | Kettle’s y linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
| 55 | Kettle’s z linear velocity | -Inf | Inf | kettle | free | linear velocity (m/s) |
| 56 | Kettle’s x axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
| 57 | Kettle’s y axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
| 58 | Kettle’s z axis angular velocity | -Inf | Inf | kettle | free | angular velocity (rad/s) |
desired_goal: this key represents the final goal to be achieved. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the goal joint configuration of each task, as specified in the Goal section.
achieved_goal: this key represents the current state of the tasks. The value is another Dict space whose keys are the tasks to be completed in the episode and whose values are the current joint configuration of each task.
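Based on the index layout listed for the observation key, the flat 59-element array can be split into its four blocks. The sketch below is an illustration only: `split_observation` and the block names are hypothetical, and the boundaries (0-8, 9-17, 18-38, 39-58) are read directly from the element table rather than from the environment's code.

```python
def split_observation(obs):
    """Split the flat 59-element observation into named blocks.

    Hypothetical helper; block names are chosen for this example.
    """
    if len(obs) != 59:
        raise ValueError(f"expected 59 elements, got {len(obs)}")
    return {
        "robot_qpos": obs[0:9],     # 7 arm joint angles + 2 finger positions
        "robot_qvel": obs[9:18],    # matching joint velocities
        "object_qpos": obs[18:39],  # knobs, burners, light, cabinets, microwave, kettle pose
        "object_qvel": obs[39:59],  # velocities of the kitchen items
    }


# A dummy observation whose value at each index equals the index itself.
obs = [float(i) for i in range(59)]
parts = split_observation(obs)

# The kettle position sits at flat indices 32..34, i.e. offsets 14..16 in object_qpos.
kettle_xyz = parts["object_qpos"][14:17]
```

The object position block has one more element than the object velocity block because the kettle's free joint stores its orientation as a four-component quaternion but its angular velocity as three components.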
Info#
The environment also returns an info dictionary in each Gymnasium step. The keys are:
tasks_to_complete (list[str]): list of tasks that haven’t yet been completed in the current episode.
step_task_completions (list[str]): list of tasks completed in the step taken.
episode_task_completions (list[str]): list of tasks completed during the episode up to the current step.
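A rollout loop that reads these keys might look like the sketch below. `StubKitchenEnv` is a hypothetical stand-in, not part of Gymnasium or the kitchen environment: it only mimics the `reset`/`step` return signatures so the loop runs without MuJoCo installed. With the real environment, the object returned by `gymnasium.make` would take its place.

```python
import random


class StubKitchenEnv:
    """Hypothetical stand-in that completes each task at a fixed step index."""

    def __init__(self, tasks, complete_at):
        self._tasks = list(tasks)
        self._complete_at = dict(complete_at)  # task name -> step at which it completes

    def reset(self, seed=None):
        self._t = 0
        self._remaining = list(self._tasks)
        self._done = []
        return {"observation": [0.0] * 59}, {}

    def step(self, action):
        self._t += 1
        step_done = [t for t in self._remaining if self._complete_at.get(t) == self._t]
        for task in step_done:
            self._remaining.remove(task)
        self._done.extend(step_done)
        info = {
            "tasks_to_complete": list(self._remaining),
            "step_task_completions": step_done,
            "episode_task_completions": list(self._done),
        }
        terminated = not self._remaining
        return {"observation": [0.0] * 59}, float(len(step_done)), terminated, False, info


def run_episode(env, max_steps=280, seed=0):
    """Take random actions and log the steps at which tasks are completed."""
    rng = random.Random(seed)
    obs, info = env.reset(seed=seed)
    total_reward, log = 0.0, []
    for step in range(1, max_steps + 1):
        action = [rng.uniform(-1.0, 1.0) for _ in range(9)]
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if info["step_task_completions"]:
            log.append((step, list(info["step_task_completions"])))
        if terminated or truncated:
            break
    return total_reward, log, info


env = StubKitchenEnv(["microwave", "kettle"], {"microwave": 3, "kettle": 5})
total_reward, log, last_info = run_episode(env)
```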
Rewards#
The environment’s reward is sparse. The reward in each Gymnasium step is equal to the number of tasks completed in that step. If no task is completed, the returned reward is zero.
A task is considered completed when its joint configuration is within a norm threshold of 0.3 of the goal configuration specified in the Goal section.
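The completion check and the sparse reward can be sketched as below. This is one reading of the rule above, under stated assumptions: the norm is taken to be Euclidean, the comparison is taken to be strict, and the helper names `completed_tasks` and `sparse_reward` are made up for the example.

```python
import math


def completed_tasks(achieved_goal, desired_goal, threshold=0.3):
    """Return the tasks whose achieved configuration is within `threshold` of the goal.

    Assumes a Euclidean norm and a strict comparison; both are guesses at
    what "within a norm threshold" means.
    """
    done = []
    for task, goal in desired_goal.items():
        current = achieved_goal[task]
        distance = math.sqrt(sum((c - g) ** 2 for c, g in zip(current, goal)))
        if distance < threshold:
            done.append(task)
    return done


def sparse_reward(achieved_goal, desired_goal, threshold=0.3):
    """Reward equals the number of tasks completed in this step."""
    return float(len(completed_tasks(achieved_goal, desired_goal, threshold)))


# Goal values for two tasks as listed in the Goal section.
desired = {"slide cabinet": [0.37], "hinge cabinet": [0.0, 1.45]}
# Illustrative current joint configurations (not taken from a real rollout).
achieved = {"slide cabinet": [0.30], "hinge cabinet": [0.0, 0.5]}

done = completed_tasks(achieved, desired)
reward = sparse_reward(achieved, desired)
```

Here the slide cabinet is 0.07 away from its goal and counts as completed, while the hinge cabinet is 0.95 away and does not, so the step reward is 1.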
Starting State#
The simulation starts with all of the joint position actuators of the Franka robot set to zero. The doors of the microwave and cabinets are closed, the burners are turned off, and the light switch is off. The kettle is placed on the bottom left burner.
Episode End#
The episode will be truncated when the duration reaches a total of max_episode_steps, which by default is set to 280 timesteps.
The episode is terminated when all the tasks have been completed, unless the terminate_on_tasks_completed argument is set to False.
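The two end conditions can be summarized in a small function. This is a sketch of the rule as stated, not the environment's code: `episode_status` is a hypothetical name, and the step counter is assumed to count steps already taken.

```python
def episode_status(remaining_tasks, elapsed_steps,
                   max_episode_steps=280, terminate_on_tasks_completed=True):
    """Return (terminated, truncated) for the current step.

    Hypothetical helper mirroring the description above.
    """
    # Termination: every task is done, and termination has not been disabled.
    terminated = terminate_on_tasks_completed and len(remaining_tasks) == 0
    # Truncation: the step budget is exhausted, regardless of task progress.
    truncated = elapsed_steps >= max_episode_steps
    return terminated, truncated


all_done_early = episode_status([], elapsed_steps=10)
all_done_no_terminate = episode_status([], elapsed_steps=10,
                                       terminate_on_tasks_completed=False)
out_of_time = episode_status(["kettle"], elapsed_steps=280)
```

With `terminate_on_tasks_completed=False` the episode keeps running after the last task is completed and ends only through truncation.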
Arguments#
The following arguments can be passed as gymnasium.make kwargs when initializing the environment:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tasks_to_complete | list[str] | All possible goal tasks (see the Goal section) | The goal tasks to reach in each episode |
| terminate_on_tasks_completed | bool | True | Terminate episode if no more tasks to complete (episodic multitask) |
| | bool | | Remove the completed tasks from the info dictionary returned after each step |
| object_noise_ratio | float | | Scaling factor applied to the uniform noise added to the kitchen object observations |
| robot_noise_ratio | float | | Scaling factor applied to the uniform noise added to the robot joint observations |
| max_episode_steps | integer | 280 | Maximum number of steps per episode |
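The effect of the two noise scaling factors can be illustrated with the simplest reading of the description: each observation element receives uniform noise drawn from [-1, 1] and multiplied by the relevant ratio. The helper names and the ratio values below are assumptions chosen for the example. They are not the environment's defaults, and the environment may apply further scaling that is not described here.

```python
import random


def add_scaled_noise(values, noise_ratio, rng):
    """Add uniform noise in [-1, 1], scaled by `noise_ratio`, to each value."""
    return [v + noise_ratio * rng.uniform(-1.0, 1.0) for v in values]


def noisy_observation(robot_obs, object_obs,
                      robot_noise_ratio, object_noise_ratio, seed=None):
    """Hypothetical helper: apply separate noise scales to robot and object readings."""
    rng = random.Random(seed)
    return (add_scaled_noise(robot_obs, robot_noise_ratio, rng)
            + add_scaled_noise(object_obs, object_noise_ratio, rng))


clean_robot = [0.0] * 18   # robot joint positions and velocities
clean_object = [0.0] * 41  # kitchen item positions and velocities
noisy = noisy_observation(clean_robot, clean_object,
                          robot_noise_ratio=0.01,    # illustrative value
                          object_noise_ratio=0.001,  # illustrative value
                          seed=0)
```

Each noisy element stays within its ratio of the clean value, so a ratio of 0 recovers the noise-free observation.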
Version History#
v1: updated version using the most recent Python MuJoCo bindings.
v0: legacy version from D4RL.