diff --git a/autonomy/il/humanoid_il.egg-info/PKG-INFO b/autonomy/il/humanoid_il.egg-info/PKG-INFO
deleted file mode 100644
index 2369a84c..00000000
--- a/autonomy/il/humanoid_il.egg-info/PKG-INFO
+++ /dev/null
@@ -1,140 +0,0 @@
-Metadata-Version: 2.4
-Name: humanoid-il
-Version: 0.1.0
-Summary: Dataset recording for the WATO humanoid arm (ROS 2 + Isaac Sim)
-Requires-Python: >=3.10
-Description-Content-Type: text/markdown
-Requires-Dist: numpy>=1.24
-Requires-Dist: pyyaml>=6.0
-Provides-Extra: lerobot
-Requires-Dist: lerobot>=0.3.0; extra == "lerobot"
-Provides-Extra: keyboard
-Requires-Dist: pynput>=1.7; extra == "keyboard"
-Provides-Extra: ros
-Requires-Dist: opencv-python-headless>=4.8; extra == "ros"
-Provides-Extra: hdf5
-Requires-Dist: h5py>=3.8; extra == "hdf5"
-Provides-Extra: record
-Requires-Dist: lerobot>=0.3.0; extra == "record"
-Requires-Dist: pynput>=1.7; extra == "record"
-Requires-Dist: h5py>=3.8; extra == "record"
-
-# Imitation learning: record (`humanoid-record`)
-
-Record teleoperation into **LeRobot** and/or **HDF5 (LeWM)** datasets. One shared frame contract for the **real arm (ROS 2)** and **Isaac Sim keyboard teleop**.
-
-Training stays outside this repo (`lerobot-train` on the LeRobot folder). HDF5 output is compatible with [LeWorldModel / stable-worldmodel](https://galilai-group.github.io/stable-worldmodel/) style loaders (`action`, `proprio`, `pixels`, `ep_len`, `ep_offset`).
-
-## Data contract
-
-**6-DOF left arm** — same joint order as `joint_command_core.cpp`:
-
-1. shoulder pitch, roll, yaw  
-2. elbow pitch, roll  
-3. wrist pitch  
-
-| Field | Source | Units |
-|-------|--------|-------|
-| `observation.state` / `proprio` | measured joint positions | rad |
-| `action` | commanded joint targets (IK output on sim, `/behaviour/arm_pose` on robot) | rad |
-| `observation.images.*` / `pixels` | cameras in schema (optional for sim) | uint8 |
-| `task` | `--task_description` | string |
-
-## Layout
-
-```
-autonomy/il/
-├── config/
-│   ├── dataset_schema.yaml       # real robot (ROS + wrist camera)
-│   └── dataset_schema_sim.yaml   # Isaac sim (joint-only, no camera)
-├── humanoid_il/
-│   ├── snapshot.py               # ObservationSnapshot
-│   ├── frame.py                  # build_lerobot_frame()
-│   ├── recorder.py               # RecordSession (episode flags + sinks)
-│   ├── record_loop.py            # blocking loop for ROS CLI
-│   ├── sim_session.py            # helper for Isaac sim scripts
-│   ├── sinks/
-│   │   ├── lerobot.py            # Parquet + MP4
-│   │   └── hdf5_lewm.py          # trajectories.h5
-│   ├── schema.py
-│   ├── record.py                 # humanoid-record CLI
-│   └── ros_buffer.py
-└── README.md
-```
-
-## Install
-
-```bash
-cd autonomy/il
-pip install -e ".[record]"          # real arm + sim recording
-pip install -e ".[record,ros]"      # + ROS image decoding
-```
-
-## Real robot (ROS 2)
-
-```bash
-source /path/to/humanoid/install/setup.bash
-
-humanoid-record \
-  --task_description "reach forward" \
-  --num_episodes 10 \
-  --sink lerobot
-```
-
-Both formats:
-
-```bash
-humanoid-record --sink lerobot,hdf5 --num_episodes 10
-```
-
-**Keyboard** (needs `pynput`):
-
-| Key | Effect |
-|-----|--------|
-| S | Start logging frames for this episode |
-| N | Finish episode → `save_episode` |
-| D | Discard buffer, re-record same episode |
-| Esc | Abort and `finalize` |
-
-**Dry run** (no robot):
-
-```bash
-humanoid-record --dry_run --sink lerobot,hdf5 --num_episodes 2 --episode_time_s 3
-```
-
-Output: `datasets/record/001/` with LeRobot tree + `trajectories.h5`.
-
-## Isaac Sim (keyboard teleop)
-
-From `autonomy/simulation/Teleop/keyboard-based teleoperation/`:
-
-```bash
-pip install -e ../../../il[record]
-
-PYTHONPATH=$(pwd) <path>/IsaacLab/isaaclab.sh -p keyboard_teleop.py --record \
-  --sink lerobot,hdf5 \
-  --num_episodes 5 \
-  --task_description "reach and grasp"
-```
-
-Uses `config/dataset_schema_sim.yaml` (joint-only, no camera). Same S/N/D/Esc keys as real-arm recording.
-
-Output: `datasets/record_sim/001/`.
-
-## Train (LeRobot, external)
-
-```bash
-lerobot-train \
-  --dataset.repo_id=humanoid/local_left_arm \
-  --dataset.root=datasets/record/001 \
-  --policy.type=act \
-  --output_dir=outputs/train/humanoid_act_v1
-```
-
-## Sinks
-
-| `--sink` | Output | Use case |
-|----------|--------|----------|
-| `lerobot` | `data/`, `videos/`, `meta/` | `lerobot-train`, HuggingFace Hub |
-| `hdf5` | `trajectories.h5` | LeWM / JEPA world-model pipelines |
-| `lerobot,hdf5` | both under same `001/` folder | sim validation + BC training |
diff --git a/autonomy/il/humanoid_il.egg-info/SOURCES.txt b/autonomy/il/humanoid_il.egg-info/SOURCES.txt
deleted file mode 100644
index f889ff11..00000000
--- a/autonomy/il/humanoid_il.egg-info/SOURCES.txt
+++ /dev/null
@@ -1,24 +0,0 @@
-README.md
-pyproject.toml
-humanoid_il/__init__.py
-humanoid_il/arm_pose_io.py
-humanoid_il/episode_keys.py
-humanoid_il/frame.py
-humanoid_il/observation.py
-humanoid_il/record.py
-humanoid_il/record_loop.py
-humanoid_il/record_utils.py
-humanoid_il/recorder.py
-humanoid_il/ros_buffer.py
-humanoid_il/schema.py
-humanoid_il/sim_session.py
-humanoid_il/snapshot.py
-humanoid_il.egg-info/PKG-INFO
-humanoid_il.egg-info/SOURCES.txt
-humanoid_il.egg-info/dependency_links.txt
-humanoid_il.egg-info/entry_points.txt
-humanoid_il.egg-info/requires.txt
-humanoid_il.egg-info/top_level.txt
-humanoid_il/sinks/__init__.py
-humanoid_il/sinks/hdf5_lewm.py
-humanoid_il/sinks/lerobot.py
\ No newline at end of file
diff --git a/autonomy/il/humanoid_il.egg-info/dependency_links.txt b/autonomy/il/humanoid_il.egg-info/dependency_links.txt
deleted file mode 100644
index 8b137891..00000000
--- a/autonomy/il/humanoid_il.egg-info/dependency_links.txt
+++ /dev/null
@@ -1 +0,0 @@
-
diff --git a/autonomy/il/humanoid_il.egg-info/entry_points.txt b/autonomy/il/humanoid_il.egg-info/entry_points.txt
deleted file mode 100644
index c25975e6..00000000
--- a/autonomy/il/humanoid_il.egg-info/entry_points.txt
+++ /dev/null
@@ -1,2 +0,0 @@
-[console_scripts]
-humanoid-record = humanoid_il.record:main
diff --git a/autonomy/il/humanoid_il.egg-info/requires.txt b/autonomy/il/humanoid_il.egg-info/requires.txt
deleted file mode 100644
index 1fc4e758..00000000
--- a/autonomy/il/humanoid_il.egg-info/requires.txt
+++ /dev/null
@@ -1,19 +0,0 @@
-numpy>=1.24
-pyyaml>=6.0
-
-[hdf5]
-h5py>=3.8
-
-[keyboard]
-pynput>=1.7
-
-[lerobot]
-lerobot>=0.3.0
-
-[record]
-lerobot>=0.3.0
-pynput>=1.7
-h5py>=3.8
-
-[ros]
-opencv-python-headless>=4.8
diff --git a/autonomy/il/humanoid_il.egg-info/top_level.txt b/autonomy/il/humanoid_il.egg-info/top_level.txt
deleted file mode 100644
index ad2fea9f..00000000
--- a/autonomy/il/humanoid_il.egg-info/top_level.txt
+++ /dev/null
@@ -1 +0,0 @@
-humanoid_il
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/modelCfg/bimanual_arm.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/modelCfg/bimanual_arm.py
new file mode 100644
index 00000000..732e059b
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/modelCfg/bimanual_arm.py
@@ -0,0 +1,295 @@
+"""
+Bimanual arm articulation config for keyboard teleoperation.
+
+Robot model: Humanoid_Wato/wato_bimanual_arm (armDouble.SLDASM.usd)
+Joint limits/poses: Isaac Sim Physics Inspector (/World/armDouble_SLDASM/root_joint)
+Motor specs: https://watonomous.github.io/humanoid-docs/mechanical/index.html
+
+  Shoulder joints 1-2  AK10-9 V3.0  18 Nm rated / 53 Nm peak
+  Elbow joints 3-5     AK80-9 V3.0   9 Nm rated / 22 Nm peak
+  Wrist joint 6        GL40 KV70     0.25 Nm rated / 0.73 Nm peak
+  Gripper              GL40 KV70     0.25 Nm rated / 0.73 Nm peak
+                       (rotary motor + linkage → prismatic finger travel in URDF)
+
+Only the left arm is actuated for teleop. The right arm is held at the
+Physics Inspector default pose below.
+
+Gripper actuation note
+----------------------
+On hardware, ONE GL40 rotary motor closes/opens the gripper through a
+mechanical linkage (screw / hinge stack). The URDF instead exposes TWO
+independent prismatic joints (joint7 + joint8, joint7l + joint8l).
+
+In sim we:
+  1. Drive both prismatic joints with synchronized position targets (open/closed pair).
+  2. Use HIGH stiffness/damping so fingers stay locked during arm motion.
+  3. Do NOT copy 0.25 Nm directly — effort_limit_sim on prismatic DOFs is a
+     linear force cap (Newtons), not motor torque. Tune by grasp/hold behaviour.
+"""
+import math
+import os
+
+import isaaclab.sim as sim_utils
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.assets.articulation import ArticulationCfg
+
+_THIS_DIR = os.path.abspath(os.path.dirname(__file__))
+_BIMANUAL_ROOT = os.path.abspath(os.path.join(_THIS_DIR, "..", "..", "..", "..", "wato_bimanual_arm"))
+_ARM_USD_PATH = os.path.join(_BIMANUAL_ROOT, "urdf", "armDouble.SLDASM", "armDouble.SLDASM.usd")
+
+
+def _deg(degrees: float) -> float:
+    return degrees * math.pi / 180.0
+
+
+# --- Joint limits from Physics Inspector ------------------------------------
+_REVOLUTE_LIMIT = (-math.pi, math.pi)
+# Set physical limits so the gripper cannot close past 0.015m!
+_JOINT7_LIMIT = (-0.05, -0.015)
+_JOINT8_LIMIT = (0.015, 0.05)
+
+JOINT_POS_LIMITS = {
+    "joint1": _REVOLUTE_LIMIT,
+    "joint2": _REVOLUTE_LIMIT,
+    "joint3": _REVOLUTE_LIMIT,
+    "joint4": _REVOLUTE_LIMIT,
+    "joint5": _REVOLUTE_LIMIT,
+    "joint6": _REVOLUTE_LIMIT,
+    "joint7": _JOINT7_LIMIT,
+    "joint8": _JOINT8_LIMIT,
+    "joint1L": _REVOLUTE_LIMIT,
+    "joint2l": _REVOLUTE_LIMIT,
+    "joint3l": _REVOLUTE_LIMIT,
+    "joint4l": _REVOLUTE_LIMIT,
+    "joint5l": _REVOLUTE_LIMIT,
+    "joint6l": _REVOLUTE_LIMIT,
+    "joint7l": _JOINT7_LIMIT,
+    "joint8l": _JOINT8_LIMIT,
+}
+
+# --- Default poses from Physics Inspector (revolute: deg -> rad) ------------
+_DEFAULT_JOINT_POS = {
+    # Right arm — held fixed during teleop
+    "joint1": _deg(-140.8),
+    "joint2": _deg(55.7),
+    "joint3": _deg(-66.0),
+    "joint4": _deg(111.4),
+    "joint5": _deg(34.8),
+    "joint6": _deg(3.5),
+    "joint7": -0.05,
+    "joint8": 0.05,
+    # Left arm — teleoperated
+    "joint1L": _deg(139.2),
+    "joint2l": _deg(66.1),
+    "joint3l": _deg(147.9),
+    "joint4l": _deg(-76.5),
+    "joint5l": _deg(-76.5),
+    "joint6l": _deg(-22.6),
+    "joint7l": -0.05,
+    "joint8l": 0.05,
+}
+
+RIGHT_ARM_JOINTS = ["joint1", "joint2", "joint3", "joint4", "joint5", "joint6", "joint7", "joint8"]
+LEFT_ARM_JOINTS = ["joint1L", "joint2l", "joint3l", "joint4l", "joint5l", "joint6l"]
+LEFT_GRIPPER_JOINTS = ["joint7l", "joint8l"]
+# Jacobian anchor is the wrist link; IK pose target is the fingertip center (see below).
+LEFT_EE_BODY = "link6l"
+LEFT_FINGER_TIP_BODIES = ("link7l", "link8l")
+# Distal mesh points in each finger link frame (link7l.STL +X, link8l.STL -X).
+LEFT_FINGER_DISTAL_TIP_LOCAL = {
+    "link7l": (0.13211595, -0.04057075, -0.00434997),
+    "link8l": (-0.13211595, -0.04057075, -0.00435003),
+}
+
+# Gripper finger targets (joint7: [-0.05, 0], joint8: [0, 0.05])
+# Synchronized pair mimics single GL40 motor driving both fingers via linkage.
+GRIPPER_OPEN = {"joint7l": -0.05, "joint8l": 0.05}
+GRIPPER_CLOSED = {"joint7l": 0.0, "joint8l": 0.0}
+
+# Prismatic gripper PD — tuned for hold during arm motion (not from motor datasheet).
+# If fingers bounce when the shoulder moves, raise stiffness; if jittery, raise damping.
+_GRIPPER_STIFFNESS = 400.0
+_GRIPPER_DAMPING = 40.0
+_GRIPPER_EFFORT_LIMIT = 30.0  # N (sim linear-force cap; tune empirically)
+_GRIPPER_VELOCITY_LIMIT = 0.2  # m/s
+
+
+def apply_joint_limits(robot) -> None:
+    """Apply Physics Inspector joint limits to the articulation."""
+    limits = robot.data.joint_pos_limits.clone()
+    updated = []
+
+    for joint_idx, joint_name in enumerate(robot.data.joint_names):
+        if joint_name not in JOINT_POS_LIMITS:
+            continue
+        lo, hi = JOINT_POS_LIMITS[joint_name]
+        limits[:, joint_idx, 0] = lo
+        limits[:, joint_idx, 1] = hi
+        updated.append(joint_name)
+
+    if updated:
+        robot.write_joint_position_limit_to_sim(limits, warn_limit_violation=False)
+        print(f"[INFO] Applied joint limits for {len(updated)} joints.")
+
+
+def resolve_body_ids(robot, names: tuple[str, ...] | list[str]) -> list[int]:
+    """Map body names to articulation body indices."""
+    body_names = list(robot.data.body_names)
+    name_to_id = {name: idx for idx, name in enumerate(body_names)}
+    missing = [name for name in names if name not in name_to_id]
+    if missing:
+        raise KeyError(f"Body names {missing} not found in {body_names}")
+    return [name_to_id[name] for name in names]
+
+
+def compute_gripper_tip_pos_w(robot, finger_body_ids: list[int]):
+    """World-frame midpoint between the distal tips of link7l and link8l."""
+    import torch
+    from isaaclab.utils.math import quat_apply
+
+    dtype = robot.data.body_pos_w.dtype
+    device = robot.data.body_pos_w.device
+    tips = []
+    for body_name, body_id in zip(LEFT_FINGER_TIP_BODIES, finger_body_ids):
+        local = torch.tensor([LEFT_FINGER_DISTAL_TIP_LOCAL[body_name]], device=device, dtype=dtype)
+        body_pos = robot.data.body_pos_w[:, body_id]
+        body_quat = robot.data.body_quat_w[:, body_id]
+        tips.append(body_pos + quat_apply(body_quat, local))
+    return (tips[0] + tips[1]) * 0.5
+
+
+def compute_gripper_tip_pose_w(robot, wrist_body_id: int, finger_body_ids: list[int]):
+    """Gripper-tip center pose in world frame (position from fingers, orientation from wrist)."""
+    tip_pos_w = compute_gripper_tip_pos_w(robot, finger_body_ids)
+    tip_quat_w = robot.data.body_quat_w[:, wrist_body_id]
+    return tip_pos_w, tip_quat_w
+
+
+def compute_gripper_tip_pose_b(robot, root_pose_w, wrist_body_id: int, finger_body_ids: list[int]):
+    """Gripper-tip center pose in the robot root frame."""
+    from isaaclab.utils.math import subtract_frame_transforms
+
+    tip_pos_w, tip_quat_w = compute_gripper_tip_pose_w(robot, wrist_body_id, finger_body_ids)
+    return subtract_frame_transforms(
+        root_pose_w[:, 0:3], root_pose_w[:, 3:7], tip_pos_w, tip_quat_w
+    )
+
+
+def jacobian_world_to_root(robot, jacobian_w):
+    """Rotate PhysX world-frame Jacobian into the articulation root frame."""
+    import torch
+    from isaaclab.utils.math import matrix_from_quat, quat_inv
+
+    base_rot = matrix_from_quat(quat_inv(robot.data.root_quat_w))
+    jacobian_b = jacobian_w.clone()
+    jacobian_b[:, :3, :] = torch.bmm(base_rot, jacobian_b[:, :3, :])
+    jacobian_b[:, 3:, :] = torch.bmm(base_rot, jacobian_b[:, 3:, :])
+    return jacobian_b
+
+
+def adjust_jacobian_for_gripper_tip(jacobian_b, wrist_pos_b, tip_pos_b):
+    """Map link6l root-frame Jacobian to the fingertip center."""
+    import torch
+    from isaaclab.utils.math import skew_symmetric_matrix
+
+    offset_b = tip_pos_b - wrist_pos_b
+    tip_jacobian = jacobian_b.clone()
+    tip_jacobian[:, 0:3, :] += torch.bmm(-skew_symmetric_matrix(offset_b), jacobian_b[:, 3:, :])
+    return tip_jacobian
+
+
+def compute_tip_ik_jacobian(robot, jacobian_w, wrist_pos_b, tip_pos_b):
+    """World-frame link6l Jacobian -> root frame -> fingertip center."""
+    return adjust_jacobian_for_gripper_tip(
+        jacobian_world_to_root(robot, jacobian_w), wrist_pos_b, tip_pos_b
+    )
+
+
+def resolve_joint_name(robot, name: str) -> str:
+    """Match a config joint name to the articulation's actual joint name."""
+    names = list(robot.data.joint_names)
+    if name in names:
+        return name
+    # Physics Inspector may show joint2L while URDF uses joint2l
+    if len(name) > 1 and name[-1] in ("l", "L"):
+        alt = name[:-1] + ("L" if name[-1] == "l" else "l")
+        if alt in names:
+            return alt
+    raise KeyError(f"Joint '{name}' not found in {names}")
+
+
+_ARM_URDF_PATH = os.path.join(_BIMANUAL_ROOT, "urdf", "armDouble.SLDASM.urdf")
+
+BIMANUAL_ARM_CFG = ArticulationCfg(
+    spawn=sim_utils.UrdfFileCfg(
+        asset_path=_ARM_URDF_PATH,
+        fix_base=True,
+        make_instanceable=True,
+        joint_drive=sim_utils.UrdfFileCfg.JointDriveCfg(
+            drive_type="force",
+            target_type="position",
+            gains=sim_utils.UrdfFileCfg.JointDriveCfg.PDGainsCfg(
+                stiffness=400.0,
+                damping=40.0,
+            ),
+        ),
+        rigid_props=sim_utils.RigidBodyPropertiesCfg(
+            disable_gravity=False,
+            max_depenetration_velocity=5.0,
+        ),
+        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
+            enabled_self_collisions=False,
+        ),
+    ),
+    init_state=ArticulationCfg.InitialStateCfg(joint_pos=_DEFAULT_JOINT_POS),
+    actuators={
+        # AK10-9 V3.0 — shoulder joints 1-2
+        "left_shoulder": ImplicitActuatorCfg(
+            joint_names_expr=["joint1L", "joint2l"],
+            stiffness=757.6,
+            damping=60.3,
+            effort_limit_sim=18.0,
+            velocity_limit_sim=3.0,
+        ),
+        # AK80-9 V3.0 — elbow joints 3-5
+        "left_elbow": ImplicitActuatorCfg(
+            joint_names_expr=["joint3l", "joint4l", "joint5l"],
+            stiffness=615.5,
+            damping=43.5,
+            effort_limit_sim=9.0,
+            velocity_limit_sim=3.0,
+        ),
+        # GL40 KV70 — wrist joint 6
+        "left_wrist": ImplicitActuatorCfg(
+            joint_names_expr=["joint6l"],
+            stiffness=170.5,
+            damping=9.0,
+            effort_limit_sim=0.25,
+            velocity_limit_sim=3.0,
+        ),
+        # GL40 KV70 rotary → linkage → two prismatic fingers (see module docstring)
+        "left_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint7l", "joint8l"],
+            stiffness=_GRIPPER_STIFFNESS,
+            damping=_GRIPPER_DAMPING,
+            effort_limit_sim=_GRIPPER_EFFORT_LIMIT,
+            velocity_limit_sim=_GRIPPER_VELOCITY_LIMIT,
+        ),
+        # Right arm revolute joints — hold Physics Inspector default pose
+        "right_arm": ImplicitActuatorCfg(
+            joint_names_expr=["joint1", "joint2", "joint3", "joint4", "joint5", "joint6"],
+            stiffness=500.0,
+            damping=50.0,
+            effort_limit_sim=18.0,
+            velocity_limit_sim=3.0,
+        ),
+        # Right gripper — same coupled-prismatic hold as left
+        "right_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint7", "joint8"],
+            stiffness=_GRIPPER_STIFFNESS,
+            damping=_GRIPPER_DAMPING,
+            effort_limit_sim=_GRIPPER_EFFORT_LIMIT,
+            velocity_limit_sim=_GRIPPER_VELOCITY_LIMIT,
+        ),
+    },
+)
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/__init__.py
index d49bebc9..dab05c40 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/__init__.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/__init__.py
@@ -1,5 +1,8 @@
 from isaaclab_tasks.utils import import_packages
 
 # The blacklist is used to prevent importing configs from sub-packages
-_BLACKLIST_PKGS = ["utils"]
+_BLACKLIST_PKGS = [
+    "utils",
+    "dextrah",   # requires RslRlPpoActorCriticRecurrentCfg which is not in this isaaclab_rl build
+]
 import_packages(__name__, _BLACKLIST_PKGS)
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/__init__.py
new file mode 100644
index 00000000..7ea0d715
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/__init__.py
@@ -0,0 +1,6 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""Manipulation environments to open drawers in a cabinet."""
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/bimanual_cabinet_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/bimanual_cabinet_env_cfg.py
new file mode 100644
index 00000000..5b370f74
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/bimanual_cabinet_env_cfg.py
@@ -0,0 +1,332 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+
+import isaaclab.sim as sim_utils
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.assets import ArticulationCfg, AssetBaseCfg
+from isaaclab.envs import ManagerBasedRLEnvCfg
+from isaaclab.managers import EventTermCfg as EventTerm
+from isaaclab.managers import ObservationGroupCfg as ObsGroup
+from isaaclab.managers import ObservationTermCfg as ObsTerm
+from isaaclab.managers import RewardTermCfg as RewTerm
+from isaaclab.managers import SceneEntityCfg
+from isaaclab.managers import TerminationTermCfg as DoneTerm
+from isaaclab.managers import CurriculumTermCfg as CurrTerm
+from isaaclab.scene import InteractiveSceneCfg
+from isaaclab.sensors import FrameTransformerCfg
+from isaaclab.sensors.frame_transformer import OffsetCfg
+from isaaclab.utils import configclass
+from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
+
+from . import mdp
+from HumanoidRLPackage.HumanoidRLSetup.modelCfg.bimanual_arm import BIMANUAL_ARM_CFG
+
+##
+# Pre-defined configs
+##
+from isaaclab.markers.config import FRAME_MARKER_CFG  # isort: skip
+
+
+FRAME_MARKER_SMALL_CFG = FRAME_MARKER_CFG.copy()
+FRAME_MARKER_SMALL_CFG.markers["frame"].scale = (0.10, 0.10, 0.10)
+
+
+##
+# Scene definition
+##
+
+
+@configclass
+class CabinetSceneCfg(InteractiveSceneCfg):
+    """Configuration for the cabinet scene with a robot and a cabinet.
+
+    This is the abstract base implementation, the exact scene is defined in the derived classes
+    which need to set the robot and end-effector frames
+    """
+
+    robot = BIMANUAL_ARM_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
+
+    cabinet = ArticulationCfg(
+        prim_path="{ENV_REGEX_NS}/Cabinet",
+        spawn=sim_utils.UsdFileCfg(
+            usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Sektion_Cabinet/sektion_cabinet_instanceable.usd",
+            activate_contact_sensors=False,
+            scale=(1.15, 1.15, 1.15),
+        ),
+        init_state=ArticulationCfg.InitialStateCfg(
+            pos=(0.75, 0.0, 0.6),
+            rot=(0.0, 0.0, 0.0, 1.0),
+            joint_pos={
+                "door_left_joint": 0.0,
+                "door_right_joint": 0.0,
+                "drawer_bottom_joint": 0.0,
+                "drawer_top_joint": 0.0,
+            },
+        ),
+        actuators={
+            "drawers": ImplicitActuatorCfg(
+                joint_names_expr=["drawer_top_joint", "drawer_bottom_joint"],
+                effort_limit_sim=87.0,
+                stiffness=10.0,
+                damping=1.0,
+            ),
+            "doors": ImplicitActuatorCfg(
+                joint_names_expr=["door_left_joint", "door_right_joint"],
+                effort_limit_sim=87.0,
+                stiffness=10.0,
+                damping=2.5,
+            ),
+        },
+    )
+
+    # Frame definitions for the cabinet.
+    cabinet_frame = FrameTransformerCfg(
+        prim_path="{ENV_REGEX_NS}/Cabinet/sektion",
+        debug_vis=True,
+        visualizer_cfg=FRAME_MARKER_SMALL_CFG.replace(prim_path="/Visuals/CabinetFrameTransformer"),
+        target_frames=[
+            FrameTransformerCfg.FrameCfg(
+                prim_path="{ENV_REGEX_NS}/Cabinet/drawer_handle_top",
+                name="drawer_handle_top",
+                offset=OffsetCfg(
+                    pos=(0.3485, 0.0, 0.01),
+                    rot=(0.5, 0.5, -0.5, -0.5),  # align with end-effector frame
+                ),
+            ),
+        ],
+    )
+
+    # plane
+    plane = AssetBaseCfg(
+        prim_path="/World/GroundPlane",
+        init_state=AssetBaseCfg.InitialStateCfg(),
+        spawn=sim_utils.GroundPlaneCfg(),
+        collision_group=-1,
+    )
+
+    # lights
+    light = AssetBaseCfg(
+        prim_path="/World/light",
+        spawn=sim_utils.DomeLightCfg(color=(0.75, 0.75, 0.75), intensity=3000.0),
+    )
+
+
+##
+# MDP settings
+##
+
+
+@configclass
+class ActionsCfg:
+    """Action specifications for the MDP."""
+
+    arm_action = mdp.JointPositionActionCfg(
+        asset_name="robot",
+        joint_names=["joint1", "joint2", "joint3", "joint4", "joint5", "joint6"],
+        scale=0.8,
+        use_default_offset=True,
+    )
+    gripper_action = mdp.BinaryJointPositionActionCfg(
+        asset_name="robot",
+        joint_names=["joint7", "joint8"],
+        open_command_expr={"joint7": -0.06, "joint8": 0.06},
+        close_command_expr={"joint7": 0.0, "joint8": 0.0},
+    )
+    # Dummy action to actively hold the left arm at its resting pose
+    left_arm_hold = mdp.JointPositionActionCfg(
+        asset_name="robot",
+        joint_names=["joint1L", "joint2l", "joint3l", "joint4l", "joint5l", "joint6l", "joint7l", "joint8l"],
+        scale=0.0,  # Neural network output is multiplied by zero
+        use_default_offset=True,  # Always targets the default resting position
+    )
+
+
+@configclass
+class ObservationsCfg:
+    """Observation specifications for the MDP."""
+
+    @configclass
+    class PolicyCfg(ObsGroup):
+        """Observations for policy group."""
+
+        joint_pos = ObsTerm(func=mdp.joint_pos_rel)
+        joint_vel = ObsTerm(func=mdp.joint_vel_rel)
+        cabinet_joint_pos = ObsTerm(
+            func=mdp.joint_pos_rel,
+            params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"])},
+        )
+        cabinet_joint_vel = ObsTerm(
+            func=mdp.joint_vel_rel,
+            params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"])},
+        )
+        rel_ee_drawer_distance = ObsTerm(func=mdp.rel_ee_drawer_distance)
+
+        actions = ObsTerm(func=mdp.last_action)
+
+        def __post_init__(self):
+            self.enable_corruption = True
+            self.concatenate_terms = True
+
+    # observation groups
+    policy: PolicyCfg = PolicyCfg()
+
+
+@configclass
+class EventCfg:
+    """Configuration for events."""
+
+    robot_physics_material = EventTerm(
+        func=mdp.randomize_rigid_body_material,
+        mode="startup",
+        params={
+            "asset_cfg": SceneEntityCfg("robot", body_names=".*"),
+            "static_friction_range": (0.8, 1.25),
+            "dynamic_friction_range": (0.8, 1.25),
+            "restitution_range": (0.0, 0.0),
+            "num_buckets": 16,
+        },
+    )
+
+    cabinet_physics_material = EventTerm(
+        func=mdp.randomize_rigid_body_material,
+        mode="startup",
+        params={
+            "asset_cfg": SceneEntityCfg("cabinet", body_names="drawer_handle_top"),
+            "static_friction_range": (2.0, 2.5),
+            "dynamic_friction_range": (2.0, 2.25),
+            "restitution_range": (0.0, 0.0),
+            "num_buckets": 16,
+        },
+    )
+
+    reset_all = EventTerm(func=mdp.reset_scene_to_default, mode="reset")
+
+    reset_robot_joints = EventTerm(
+        func=mdp.reset_joints_by_offset,
+        mode="reset",
+        params={
+            "asset_cfg": SceneEntityCfg(
+                "robot", 
+                joint_names=["joint1", "joint2", "joint3", "joint4", "joint5", "joint6", "joint7", "joint8"]
+            ),
+            "position_range": (-0.1, 0.1),
+            "velocity_range": (0.0, 0.0),
+        },
+    )
+
+
+@configclass
+class RewardsCfg:
+    """Reward terms for the MDP."""
+
+    # 1. Approach the handle
+    approach_ee_handle = RewTerm(func=mdp.approach_ee_handle, weight=2.0, params={"threshold": 0.2})
+    align_ee_handle = RewTerm(func=mdp.align_ee_handle, weight=0.5)
+
+    # 2. Grasp the handle
+    approach_gripper_handle = RewTerm(
+        func=mdp.approach_gripper_handle, weight=5.0, params={"offset": 0.04})
+    align_grasp_around_handle = RewTerm(func=mdp.align_grasp_around_handle, weight=0.125)
+    
+    straddle_handle = RewTerm(
+        func=mdp.straddle_handle,
+        weight=25.0,
+        params={"threshold": 0.05}
+    )
+
+    grasp_handle = RewTerm(
+        func=mdp.grasp_handle,
+        weight=100.0,
+        params={
+            "threshold": 0.03,
+            "open_joint_pos": 0.06,
+            "asset_cfg": SceneEntityCfg("robot", joint_names=["joint7", "joint8"]),
+        },
+    )
+
+    # 3. Open the drawer
+    open_drawer_bonus = RewTerm(
+        func=mdp.open_drawer_bonus,
+        weight=200.0,
+        params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"])},
+    )
+    multi_stage_open_drawer = RewTerm(
+        func=mdp.multi_stage_open_drawer,
+        weight=50.0,
+        params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"])},
+    )
+
+    # 4. Penalize actions for cosmetic reasons (Conditional Jerk Penalties)
+    action_rate_l2 = RewTerm(
+        func=mdp.conditional_action_rate_l2, 
+        weight=-0.05,
+        params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"])},
+    )
+    joint_vel = RewTerm(
+        func=mdp.conditional_joint_vel_l2, 
+        weight=-0.01,
+        params={
+            "asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"]),
+            "robot_cfg": SceneEntityCfg("robot", joint_names=["joint1", "joint2", "joint3", "joint4", "joint5", "joint6", "joint7", "joint8"])
+        },
+    )
+
+
+@configclass
+class TerminationsCfg:
+    """Termination terms for the MDP."""
+
+    time_out = DoneTerm(func=mdp.time_out, time_out=True)
+    success = DoneTerm(
+        func=mdp.joint_pos_out_of_manual_limit,
+        params={"asset_cfg": SceneEntityCfg("cabinet", joint_names=["drawer_top_joint"]), "bounds": (0.0, 0.39)}
+    )
+
+@configclass
+class CurriculumCfg:
+    """Curriculum terms for the MDP."""
+    pass
+
+
+##
+# Environment configuration
+##
+
+
+@configclass
+class CabinetEnvCfg(ManagerBasedRLEnvCfg):
+    """Configuration for the cabinet environment."""
+
+    # Scene settings
+    scene: CabinetSceneCfg = CabinetSceneCfg(num_envs=4096, env_spacing=2.0)
+    # Basic settings
+    observations: ObservationsCfg = ObservationsCfg()
+    actions: ActionsCfg = ActionsCfg()
+    # MDP settings
+    rewards: RewardsCfg = RewardsCfg()
+    terminations: TerminationsCfg = TerminationsCfg()
+    events: EventCfg = EventCfg()
+    curriculum: CurriculumCfg = CurriculumCfg()
+
+    def __post_init__(self):
+        """Post initialization."""
+        # general settings
+        self.decimation = 1
+        self.episode_length_s = 8.0
+        self.viewer.eye = (-2.0, 2.0, 2.0)
+        self.viewer.lookat = (0.8, 0.0, 0.5)
+        # simulation settings
+        self.sim.dt = 1 / 60  # 60Hz
+        self.sim.render_interval = self.decimation
+        self.sim.physx.bounce_threshold_velocity = 0.2
+        self.sim.physx.bounce_threshold_velocity = 0.01
+        self.sim.physx.friction_correlation_distance = 0.00625
+        self.scene.robot.init_state.pos = (-0.1, 0.0, 0.4)
+
+
+# PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/train.py --task=Isaac-Open-Drawer-Humanoid-Arm-v0 --headless
+
+# PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/play.py --task=Isaac-Open-Drawer-Humanoid-Arm-v0 --num_envs=1
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/__init__.py
new file mode 100644
index 00000000..235b81c3
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/__init__.py
@@ -0,0 +1,23 @@
+import gymnasium as gym
+
+from . import agents
+
+gym.register(
+    id="Isaac-Open-Drawer-Bimanual-Arm-v0",
+    entry_point="isaaclab.envs:ManagerBasedRLEnv",
+    disable_env_checker=True,
+    kwargs={
+        "env_cfg_entry_point": f"{__name__}.joint_pos_env_cfg:BimanualArmCabinetEnvCfg",
+        "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:BimanualArmCabinetPPORunnerCfg",
+    },
+)
+
+gym.register(
+    id="Isaac-Open-Drawer-Bimanual-Arm-Play-v0",
+    entry_point="isaaclab.envs:ManagerBasedRLEnv",
+    disable_env_checker=True,
+    kwargs={
+        "env_cfg_entry_point": f"{__name__}.joint_pos_env_cfg:BimanualArmCabinetEnvCfg_PLAY",
+        "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:BimanualArmCabinetPPORunnerCfg",
+    },
+)
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/agents/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/agents/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py
new file mode 100644
index 00000000..738466ac
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py
@@ -0,0 +1,35 @@
+from isaaclab.utils import configclass
+from isaaclab_rl.rsl_rl import (
+    RslRlOnPolicyRunnerCfg,
+    RslRlPpoActorCriticCfg,
+    RslRlPpoAlgorithmCfg,
+)
+
+
+@configclass
+class BimanualArmCabinetPPORunnerCfg(RslRlOnPolicyRunnerCfg):
+    num_steps_per_env = 96
+    max_iterations = 500
+    save_interval = 50
+    experiment_name = "bimanual_cabinet"
+    empirical_normalization = True
+    policy = RslRlPpoActorCriticCfg(
+        init_noise_std=1.0,
+        actor_hidden_dims=[256, 128, 64],
+        critic_hidden_dims=[256, 128, 64],
+        activation="elu",
+    )
+    algorithm = RslRlPpoAlgorithmCfg(
+        value_loss_coef=1.0,
+        use_clipped_value_loss=True,
+        clip_param=0.2,
+        entropy_coef=0.0,  # Set to 0.0 to force the AI to rely on its deterministic brain instead of random noise!
+        num_learning_epochs=5,
+        num_mini_batches=4,
+        learning_rate=5.0e-4,
+        schedule="adaptive",
+        gamma=0.99,
+        lam=0.95,
+        desired_kl=0.02,
+        max_grad_norm=1.0,
+    )
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/joint_pos_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/joint_pos_env_cfg.py
new file mode 100644
index 00000000..73013a80
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/HumanoidRLEnv/joint_pos_env_cfg.py
@@ -0,0 +1,18 @@
+from isaaclab.utils import configclass
+
+from HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.bimanual_cabinet_env_cfg import CabinetEnvCfg
+
+
+@configclass
+class BimanualArmCabinetEnvCfg(CabinetEnvCfg):
+    """Cabinet open-drawer task for the bimanual arm."""
+    pass
+
+
+@configclass
+class BimanualArmCabinetEnvCfg_PLAY(BimanualArmCabinetEnvCfg):
+    def __post_init__(self):
+        super().__post_init__()
+        self.scene.num_envs = 50
+        self.scene.env_spacing = 2.5
+        self.observations.policy.enable_corruption = False
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/__init__.py
new file mode 100644
index 00000000..d7c38f5c
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/config/__init__.py
@@ -0,0 +1,9 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""Configurations for the cabinet environments."""
+
+# We leave this file empty since we don't want to expose any configs in this package directly.
+# We still need this file to import the "config" module in the parent package.
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/__init__.py
new file mode 100644
index 00000000..79a9af2f
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/__init__.py
@@ -0,0 +1,11 @@
+# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""This sub-module contains the functions that are specific to the cabinet environments."""
+
+from isaaclab.envs.mdp import *  # noqa: F401, F403
+
+from .observations import *  # noqa: F401, F403
+from .rewards import *  # noqa: F401, F403
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/observations.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/observations.py
new file mode 100644
index 00000000..5016ded6
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/observations.py
@@ -0,0 +1,66 @@
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+import torch
+
+import isaaclab.utils.math as math_utils
+from isaaclab.assets import ArticulationData
+from isaaclab.sensors import FrameTransformerData
+
+if TYPE_CHECKING:
+    from isaaclab.envs import ManagerBasedRLEnv
+
+
+def _robot_ee_pos(env: ManagerBasedRLEnv) -> torch.Tensor | None:
+    """End-effector position from robot articulation (body DIP_INDEX). Returns None if not ready."""
+    robot = env.scene["robot"]
+    ids, _ = robot.find_bodies("link7", preserve_order=True)
+    if not ids:
+        return None
+    return robot.data.body_pos_w[:, ids[0], :]
+
+
+def rel_ee_object_distance(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """The distance between the end-effector and the object."""
+    ee_tf_data: FrameTransformerData = env.scene["ee_frame"].data
+    object_data: ArticulationData = env.scene["object"].data
+
+    return object_data.root_pos_w - ee_tf_data.target_pos_w[..., 0, :]
+
+
+def rel_ee_drawer_distance(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """The distance between the end-effector and the object."""
+    ee_pos = _robot_ee_pos(env)
+    if ee_pos is None:
+        return torch.zeros(env.num_envs, 3, device=env.device)
+    cabinet_tf_data: FrameTransformerData = env.scene["cabinet_frame"].data
+    handle_pos = cabinet_tf_data.target_pos_w[..., 0, :]
+    return handle_pos - ee_pos
+
+
+def fingertips_pos(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """The position of the fingertips relative to the environment origins."""
+    ee_tf_data: FrameTransformerData = env.scene["ee_frame"].data
+    fingertips_pos = ee_tf_data.target_pos_w[..., 1:, :] - env.scene.env_origins.unsqueeze(1)
+
+    return fingertips_pos.view(env.num_envs, -1)
+
+
+def ee_pos(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """The position of the end-effector relative to the environment origins."""
+    ee_tf_data: FrameTransformerData = env.scene["ee_frame"].data
+    ee_pos = ee_tf_data.target_pos_w[..., 0, :] - env.scene.env_origins
+
+    return ee_pos
+
+
+def ee_quat(env: ManagerBasedRLEnv, make_quat_unique: bool = True) -> torch.Tensor:
+    """The orientation of the end-effector in the environment frame.
+
+    If :attr:`make_quat_unique` is True, the quaternion is made unique by ensuring the real part is positive.
+    """
+    ee_tf_data: FrameTransformerData = env.scene["ee_frame"].data
+    ee_quat = ee_tf_data.target_quat_w[..., 0, :]
+    # make first element of quaternion positive
+    return math_utils.quat_unique(ee_quat) if make_quat_unique else ee_quat
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/rewards.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/rewards.py
new file mode 100644
index 00000000..38fffa08
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/mdp/rewards.py
@@ -0,0 +1,301 @@
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+import torch
+
+from isaaclab.managers import SceneEntityCfg
+from isaaclab.utils.math import matrix_from_quat
+
+if TYPE_CHECKING:
+    from isaaclab.envs import ManagerBasedRLEnv
+
+
+def _robot_ee_pose(env: ManagerBasedRLEnv):
+    """(ee_tcp_pos, ee_tcp_quat, lfinger_pos, rfinger_pos) from robot articulation bodies. Uses body name regex like reach."""
+    robot = env.scene["robot"]
+    # For bimanual arm, the right hand fingers are link7 and link8
+    ee_ids, _ = robot.find_bodies("link7", preserve_order=True)
+    thumb_ids, _ = robot.find_bodies("link8", preserve_order=True)
+    if not ee_ids or not thumb_ids:
+        return None
+    lfinger_pos = robot.data.body_pos_w[:, ee_ids[0], :]
+    rfinger_pos = robot.data.body_pos_w[:, thumb_ids[0], :]
+    # The true End-Effector TCP is exactly halfway between the two fingers!
+    ee_tcp_pos = (lfinger_pos + rfinger_pos) / 2.0
+    ee_tcp_quat = robot.data.body_quat_w[:, ee_ids[0], :]
+    return (ee_tcp_pos, ee_tcp_quat, lfinger_pos, rfinger_pos)
+
+
+def approach_ee_handle(env: ManagerBasedRLEnv, threshold: float) -> torch.Tensor:
+    r"""Reward the robot for reaching the drawer handle using inverse-square law.
+
+    It uses a piecewise function to reward the robot for reaching the handle.
+
+    .. math::
+
+        reward = \begin{cases}
+            2 * (1 / (1 + distance^2))^2 & \text{if } distance \leq threshold \\
+            (1 / (1 + distance^2))^2 & \text{otherwise}
+        \end{cases}
+
+    """
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+    ee_tcp_pos, _, _, _ = pose
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+
+    distance = torch.norm(handle_pos - ee_tcp_pos, dim=-1, p=2)
+    reward = 1.0 / (1.0 + distance**2)
+    reward = torch.pow(reward, 2)
+    return torch.where(distance <= threshold, 2 * reward, reward)
+
+
+def align_ee_handle(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """Reward for aligning the end-effector with the handle."""
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+    _, ee_tcp_quat, _, _ = pose
+    handle_quat = env.scene["cabinet_frame"].data.target_quat_w[..., 0, :]
+
+    ee_tcp_rot_mat = matrix_from_quat(ee_tcp_quat)
+    handle_mat = matrix_from_quat(handle_quat)
+
+    handle_x, handle_y = handle_mat[..., 0], handle_mat[..., 1]
+    ee_tcp_x, ee_tcp_z = ee_tcp_rot_mat[..., 0], ee_tcp_rot_mat[..., 2]
+
+    align_z = torch.bmm(ee_tcp_z.unsqueeze(1), -handle_x.unsqueeze(-1)).squeeze(-1).squeeze(-1)
+    align_x = torch.bmm(ee_tcp_x.unsqueeze(1), -handle_y.unsqueeze(-1)).squeeze(-1).squeeze(-1)
+    return 0.5 * (torch.sign(align_z) * align_z**2 + torch.sign(align_x) * align_x**2)
+
+
+def align_grasp_around_handle(env: ManagerBasedRLEnv) -> torch.Tensor:
+    """Bonus for correct hand orientation around the handle."""
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device, dtype=torch.bool)
+    _, _, lfinger_pos, rfinger_pos = pose
+
+    is_graspable = (rfinger_pos[:, 2] < handle_pos[:, 2]) & (lfinger_pos[:, 2] > handle_pos[:, 2])
+    return is_graspable
+
+
+def approach_gripper_handle(env: ManagerBasedRLEnv, offset: float = 0.04) -> torch.Tensor:
+    """Reward the robot's gripper reaching the drawer handle with the right pose."""
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+    _, _, lfinger_pos, rfinger_pos = pose
+
+    lfinger_dist = torch.abs(lfinger_pos[:, 2] - handle_pos[:, 2])
+    rfinger_dist = torch.abs(rfinger_pos[:, 2] - handle_pos[:, 2])
+    is_graspable = (rfinger_pos[:, 2] < handle_pos[:, 2]) & (lfinger_pos[:, 2] > handle_pos[:, 2])
+
+    return is_graspable * ((offset - lfinger_dist) + (offset - rfinger_dist))
+
+
+def grasp_handle(
+    env: ManagerBasedRLEnv, threshold: float, open_joint_pos: float, asset_cfg: SceneEntityCfg
+) -> torch.Tensor:
+    """Reward for closing the fingers when being close to the handle.
+
+    The :attr:`threshold` is the distance from the handle at which the fingers should be closed.
+    The :attr:`open_joint_pos` is the joint position when the fingers are open.
+
+    Note:
+        It is assumed that zero joint position corresponds to the fingers being closed.
+    """
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+    ee_tcp_pos, _, _, _ = pose
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    gripper_joint_pos = env.scene[asset_cfg.name].data.joint_pos[:, asset_cfg.joint_ids]
+
+    distance = torch.norm(handle_pos - ee_tcp_pos, dim=-1, p=2)
+    is_close = distance <= threshold
+
+    # The physics engine will now naturally clamp gripper_joint_pos to 0.015!
+    return is_close * torch.sum(open_joint_pos - torch.abs(gripper_joint_pos), dim=-1)
+
+
+def open_drawer_bonus(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg) -> torch.Tensor:
+    """Bonus for opening the drawer given by the joint position of the drawer.
+    Gives a MASSIVE multiplier if the robot's hand is physically on the handle while it opens!
+    """
+    drawer_pos = env.scene[asset_cfg.name].data.joint_pos[:, asset_cfg.joint_ids[0]]
+    
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return drawer_pos
+        
+    ee_tcp_pos, _, lfinger_pos, rfinger_pos = pose
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    
+    # Use the exact Z-coordinate of the offset point!
+    target_z = handle_pos[..., 2]
+    dz = torch.abs(target_z - ee_tcp_pos[..., 2])
+    
+    # X/Y distance
+    dx_dy = torch.norm(handle_pos[..., :2] - ee_tcp_pos[..., :2], dim=-1, p=2)
+    
+    # ASYMMETRIC GRADIENT: The top lip of the drawer is physically higher than the handle.
+    # If the robot's hand goes higher than target_z + 2cm, it gets EXACTLY 0 points!
+    # If it is below or at the handle, it gets the continuous breadcrumb trail.
+    is_too_high = ee_tcp_pos[..., 2] > (target_z + 0.02)
+    z_score = torch.where(is_too_high, torch.zeros_like(dz), torch.exp(-50.0 * dz))
+    
+    # Wide breadcrumb trail to get the hand generally near the handle
+    xy_score = torch.where(dx_dy <= 0.15, torch.exp(-20.0 * dx_dy), torch.zeros_like(dx_dy))
+    
+    # BULLSEYE MULTIPLIER: Gives up to 5x MORE points for pulling from the exact middle!
+    # Creates an ultra-sharp gravitational well directly in the center of the handle.
+    bullseye_multiplier = 1.0 + torch.where(dx_dy <= 0.04, 4.0 * torch.exp(-150.0 * dx_dy), torch.zeros_like(dx_dy))
+    
+    is_close = z_score * xy_score * bullseye_multiplier
+    
+    # Check if the claw is actually closed!
+    # Fully open is ~10cm apart. Fully closed is ~3cm apart. 
+    finger_dist = torch.norm(lfinger_pos - rfinger_pos, dim=-1, p=2)
+    is_claw_closed = (finger_dist < 0.06).float()
+    
+    # 1x points for pulling with handle, 10x AS MANY points if claw is closed!
+    claw_multiplier = 1.0 + (is_claw_closed * 9.0)
+
+    # EXACTLY 0 points if it is pulling from the top of the shelf!
+    return is_close * drawer_pos * claw_multiplier
+
+
+def straddle_handle(env: ManagerBasedRLEnv, threshold: float) -> torch.Tensor:
+    """Reward the robot for threading its fingers on strictly opposite sides of the handle.
+    This mathematically guarantees the handle is trapped INSIDE the grip, rather than pinched from outside.
+    """
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+        
+    ee_tcp_pos, _, lfinger_pos, rfinger_pos = pose
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    
+    # Only reward straddling if the hand is physically near the handle
+    distance_to_handle = torch.norm(handle_pos - ee_tcp_pos, dim=-1, p=2)
+    is_close = (distance_to_handle <= threshold).float()
+    
+    # Calculate Euclidean distances
+    dist_thumb_to_handle = torch.norm(handle_pos - rfinger_pos, dim=-1, p=2)
+    dist_index_to_handle = torch.norm(handle_pos - lfinger_pos, dim=-1, p=2)
+    dist_thumb_to_index = torch.norm(lfinger_pos - rfinger_pos, dim=-1, p=2)
+    
+    # If the handle is PERFECTLY between the two fingers, then:
+    # dist(thumb, handle) + dist(index, handle) == dist(thumb, index)
+    # We penalize any deviation from this perfect straight line!
+    linearity_deviation = (dist_thumb_to_handle + dist_index_to_handle) - dist_thumb_to_index
+    
+    # Convert deviation to a 0-to-1 score (1.0 means perfectly straddled)
+    # We use a sharp exponential curve so it only gets points if it's highly accurate
+    straddle_score = torch.exp(-50.0 * linearity_deviation)
+    
+    return is_close * straddle_score
+
+
+def multi_stage_open_drawer(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg) -> torch.Tensor:
+    """Multi-stage bonus for opening the drawer.
+
+    Depending on the drawer's position, the reward is given in three stages: easy, medium, and hard.
+    This helps the agent to learn to open the drawer in a controlled manner.
+    """
+    drawer_pos = env.scene[asset_cfg.name].data.joint_pos[:, asset_cfg.joint_ids[0]]
+    
+    pose = _robot_ee_pose(env)
+    if pose is None:
+        return torch.zeros(env.num_envs, device=env.device)
+        
+    ee_tcp_pos, _, lfinger_pos, rfinger_pos = pose
+    handle_pos = env.scene["cabinet_frame"].data.target_pos_w[..., 0, :]
+    
+    # Use the exact Z-coordinate of the offset point!
+    target_z = handle_pos[..., 2]
+    dz = torch.abs(target_z - ee_tcp_pos[..., 2])
+    
+    # X/Y distance
+    dx_dy = torch.norm(handle_pos[..., :2] - ee_tcp_pos[..., :2], dim=-1, p=2)
+    
+    # ASYMMETRIC GRADIENT to ban the top lip
+    is_too_high = ee_tcp_pos[..., 2] > (target_z + 0.02)
+    z_score = torch.where(is_too_high, torch.zeros_like(dz), torch.exp(-50.0 * dz))
+    
+    # Wide breadcrumb trail
+    xy_score = torch.where(dx_dy <= 0.15, torch.exp(-20.0 * dx_dy), torch.zeros_like(dx_dy))
+    
+    # BULLSEYE MULTIPLIER: 5x points for exact middle
+    bullseye_multiplier = 1.0 + torch.where(dx_dy <= 0.04, 4.0 * torch.exp(-150.0 * dx_dy), torch.zeros_like(dx_dy))
+    
+    is_close = z_score * xy_score * bullseye_multiplier
+    
+    # Claw closed multiplier (10x)
+    finger_dist = torch.norm(lfinger_pos - rfinger_pos, dim=-1, p=2)
+    is_claw_closed = (finger_dist < 0.06).float()
+    claw_multiplier = 1.0 + (is_claw_closed * 9.0)
+
+    # ALL milestones require is_close. If it pulls the shelf, it gets exactly 0!
+    open_easy = (drawer_pos > 0.01) * 0.5 * is_close * claw_multiplier
+    open_medium = (drawer_pos > 0.2) * is_close * claw_multiplier
+    open_hard = (drawer_pos > 0.3) * is_close * claw_multiplier
+
+    return open_easy + open_medium + open_hard
+
+
+def conditional_action_rate_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg) -> torch.Tensor:
+    """Action rate penalty that scales down as the drawer opens.
+    
+    If the drawer is completely closed, the penalty multiplier is 1.0.
+    As the drawer opens, the penalty multiplier scales down to 0.1 (so it's allowed to be slightly jerky while pulling).
+    """
+    # Base action rate penalty (squared difference of actions)
+    action_rate = torch.sum(torch.square(env.action_manager.action - env.action_manager.prev_action), dim=1)
+    
+    # Get drawer position
+    drawer_pos = env.scene[asset_cfg.name].data.joint_pos[:, asset_cfg.joint_ids[0]]
+    
+    # Scale multiplier from 1.0 (at 0m) down to 0.1 (at 0.35m)
+    multiplier = torch.clamp(1.0 - (drawer_pos / 0.35), min=0.1, max=1.0)
+    
+    return action_rate * multiplier
+
+
+def conditional_joint_vel_l2(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg, robot_cfg: SceneEntityCfg) -> torch.Tensor:
+    """Joint velocity penalty that scales down as the drawer opens."""
+    # Base joint velocity penalty
+    joint_vel = torch.sum(torch.square(env.scene[robot_cfg.name].data.joint_vel), dim=1)
+    
+    # Get drawer position
+    drawer_pos = env.scene[asset_cfg.name].data.joint_pos[:, asset_cfg.joint_ids[0]]
+    
+    # Scale multiplier from 1.0 (at 0m) down to 0.1 (at 0.35m)
+    multiplier = torch.clamp(1.0 - (drawer_pos / 0.35), min=0.1, max=1.0)
+    
+    return joint_vel * multiplier
+
+
+def print_stage_curriculum(env: ManagerBasedRLEnv, env_ids: torch.Tensor, term_name: str, weight: float, num_steps: int) -> float:
+    """A wrapper for modify_reward_weight that prints a message exactly when the stage transitions."""
+    from isaaclab.envs.mdp import modify_reward_weight
+    
+    # env.common_step_counter increments by the number of environments each step.
+    # We print a message when the step counter crosses the threshold.
+    if env.common_step_counter >= num_steps and env.common_step_counter < num_steps + env.num_envs * 2:
+        # Only print once (for the 0th environment) to avoid spamming the console
+        if len(env_ids) > 0 and env_ids[0] == 0:
+            print(f"\n=======================================================")
+            print(f"CURRICULUM UNLOCKED: STAGE 2")
+            print(f"=======================================================")
+            print(f"The AI has mastered reaching! Now forcing it to pull...")
+            print(f"Updating reward term '{term_name}' to {weight}")
+            print(f"=======================================================\n")
+            
+    return modify_reward_weight(env, env_ids, term_name, weight, num_steps)
+
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/find_offset.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/find_offset.py
new file mode 100644
index 00000000..c031a69a
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/find_offset.py
@@ -0,0 +1,46 @@
+import sys
+import argparse
+from isaaclab.app import AppLauncher
+
+# We have to launch Isaac Lab headlessly first, otherwise the Pixar library
+# doesn't know how to download files from NVIDIA's cloud servers!
+parser = argparse.ArgumentParser()
+AppLauncher.add_app_launcher_args(parser)
+args_cli, _ = parser.parse_known_args()
+args_cli.headless = True
+app_launcher = AppLauncher(args_cli)
+simulation_app = app_launcher.app
+
+from pxr import Usd, UsdGeom, Gf
+from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
+
+# The exact path used by Isaac Lab for the cabinet
+usd_path = f"{ISAAC_NUCLEUS_DIR}/Props/Sektion_Cabinet/sektion_cabinet_instanceable.usd"
+
+print(f"Downloading and opening USD: {usd_path}")
+print("(This may take a few seconds...)\n")
+
+stage = Usd.Stage.Open(usd_path)
+
+if not stage:
+    print("Could not open USD file!")
+    sys.exit(1)
+
+print("===== FOUND PRIMS RELATED TO THE TOP DRAWER =====")
+for prim in stage.Traverse():
+    name = prim.GetName().lower()
+    if "drawer" in name or "handle" in name:
+        # We only care about the top drawer stuff
+        if "bottom" not in name:
+            print(f"Path: {prim.GetPath()}")
+            
+            # Extract its mathematical location in the 3D file
+            xform = UsdGeom.Xformable(prim)
+            if xform:
+                time = Usd.TimeCode.Default()
+                transform = xform.ComputeLocalToWorldTransform(time)
+                translation = transform.ExtractTranslation()
+                print(f"   World Position (X, Y, Z): ({translation[0]:.4f}, {translation[1]:.4f}, {translation[2]:.4f})")
+            print("-" * 50)
+
+print("\nDONE!")
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/play.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/play.py
index d1444809..6485a6df 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/play.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/play.py
@@ -8,6 +8,17 @@
 """Launch Isaac Sim Simulator first."""
 
 import argparse
+import sys
+import os
+
+# Ensure HumanoidRLPackage is always importable regardless of working directory
+_script_dir = os.path.dirname(os.path.abspath(__file__))          # rsl_rl_scripts/
+_humanoid_rl_dir = os.path.dirname(_script_dir)                    # HumanoidRLPackage/
+_parent_dir = os.path.dirname(_humanoid_rl_dir)                    # HumanoidRL/
+if _parent_dir not in sys.path:
+    sys.path.insert(0, _parent_dir)
+if _script_dir not in sys.path:
+    sys.path.insert(0, _script_dir)  # so cli_args.py is also found
 
 from isaaclab.app import AppLauncher
 
@@ -74,7 +85,8 @@ def main():
     agent_cfg: RslRlOnPolicyRunnerCfg = cli_args.parse_rsl_rl_cfg(args_cli.task, args_cli)
 
     # specify directory for logging experiments
-    log_root_path = os.path.join("logs", "rsl_rl", agent_cfg.experiment_name)
+    repo_root = os.path.abspath(os.path.join(_script_dir, "../../../../../../"))
+    log_root_path = os.path.join(repo_root, "trained_models", agent_cfg.experiment_name)
     log_root_path = os.path.abspath(log_root_path)
     print(f"[INFO] Loading experiment from directory: {log_root_path}")
     if args_cli.use_pretrained_checkpoint:
@@ -99,13 +111,21 @@ def main():
 
     # wrap for video recording
     if args_cli.video:
+        import datetime
+        # Create a unique ID using a timestamp
+        unique_id = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
+        
+        # Point the video folder explicitly to the requested directory
+        target_video_dir = os.path.join(_humanoid_rl_dir, "HumanoidRLSetup", "tasks", "bimanual_cabinet", "videos")
+        
         video_kwargs = {
-            "video_folder": os.path.join(log_dir, "videos", "play"),
+            "video_folder": target_video_dir,
+            "name_prefix": f"demo_{unique_id}",
             "step_trigger": lambda step: step == 0,
             "video_length": args_cli.video_length,
             "disable_logger": True,
         }
-        print("[INFO] Recording videos during training.")
+        print(f"[INFO] Recording videos to: {target_video_dir}")
         print_dict(video_kwargs, nesting=4)
         env = gym.wrappers.RecordVideo(env, **video_kwargs)
 
@@ -145,9 +165,13 @@ def main():
             # env stepping
             obs, _, _, _ = env.step(actions)
         if args_cli.video:
+            if timestep % 50 == 0:
+                print(f"[INFO] Recording video step {timestep}/{args_cli.video_length}...")
+            
             timestep += 1
             # Exit the play loop after recording one video
-            if timestep == args_cli.video_length:
+            if timestep >= args_cli.video_length:
+                print(f"[INFO] Finished recording {args_cli.video_length} steps! Video saved successfully.")
                 break
 
         # time delay for real-time evaluation
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/test_sim.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/test_sim.py
new file mode 100644
index 00000000..70292f9e
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/test_sim.py
@@ -0,0 +1,66 @@
+"""
+Simple script to boot up the Isaac Lab GUI with the Cabinet scene.
+Run this using: python autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/test_sim.py
+"""
+
+import argparse
+from isaaclab.app import AppLauncher
+
+# Launch the app
+parser = argparse.ArgumentParser(description="Test Isaac Lab GUI over VNC.")
+AppLauncher.add_app_launcher_args(parser)
+args_cli = parser.parse_args()
+
+# Force the app to open the GUI (turn off headless)
+args_cli.headless = False
+app_launcher = AppLauncher(args_cli)
+simulation_app = app_launcher.app
+
+import isaaclab.sim as sim_utils
+from isaaclab.assets import AssetBaseCfg
+from isaaclab.scene import InteractiveSceneCfg, InteractiveScene
+from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
+
+# Define a simple scene with just the ground and the cabinet
+class CabinetTestSceneCfg(InteractiveSceneCfg):
+    ground = AssetBaseCfg(
+        prim_path="/World/ground",
+        spawn=sim_utils.GroundPlaneCfg(),
+    )
+    
+    cabinet = AssetBaseCfg(
+        prim_path="/World/Cabinet",
+        spawn=sim_utils.UsdFileCfg(
+            usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Sektion_Cabinet/sektion_cabinet_instanceable.usd",
+            scale=(1.15, 1.15, 1.15),
+        ),
+        init_state=AssetBaseCfg.InitialStateCfg(
+            pos=(0.75, 0.0, 0.35),
+            rot=(0.0, 0.0, 0.0, 1.0),
+        )
+    )
+
+    light = AssetBaseCfg(
+        prim_path="/World/light",
+        spawn=sim_utils.DomeLightCfg(color=(0.75, 0.75, 0.75), intensity=2500.0),
+    )
+
+# Create the scene
+scene_cfg = CabinetTestSceneCfg(num_envs=1, env_spacing=2.0)
+scene = InteractiveScene(scene_cfg)
+
+# Play the simulation
+simulation_app.update()
+sim = sim_utils.SimulationContext()
+sim.play()
+
+print("==================================================================")
+print("Simulation is running! Try to look around using your mouse.")
+print("If the VNC crashes now, it means it cannot handle the RTX Renderer.")
+print("==================================================================")
+
+# Keep the window open
+while simulation_app.is_running():
+    sim.step()
+
+simulation_app.close()
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/train.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/train.py
index fc78b8b2..c737b47f 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/train.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/rsl_rl_scripts/train.py
@@ -9,6 +9,17 @@
 
 import argparse
 import sys
+import os
+
+# Ensure HumanoidRLPackage is always importable regardless of working directory
+_script_dir = os.path.dirname(os.path.abspath(__file__))          # rsl_rl_scripts/
+_humanoid_rl_dir = os.path.dirname(_script_dir)                    # HumanoidRLPackage/
+_parent_dir = os.path.dirname(_humanoid_rl_dir)                    # HumanoidRL/
+if _parent_dir not in sys.path:
+    sys.path.insert(0, _parent_dir)
+if _script_dir not in sys.path:
+    sys.path.insert(0, _script_dir)  # so cli_args.py is also found
+
 
 from isaaclab.app import AppLauncher
 
@@ -92,7 +103,9 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen
     env_cfg.sim.device = args_cli.device if args_cli.device is not None else env_cfg.sim.device
 
     # specify directory for logging experiments
-    log_root_path = os.path.join("logs", "rsl_rl", agent_cfg.experiment_name)
+    # Save directly to trained_models folder in the repo root!
+    repo_root = os.path.abspath(os.path.join(_script_dir, "../../../../../../"))
+    log_root_path = os.path.join(repo_root, "trained_models", agent_cfg.experiment_name)
     log_root_path = os.path.abspath(log_root_path)
     print(f"[INFO] Logging experiment in directory: {log_root_path}")
     # specify directory for logging runs: {time-stamp}_{run_name}
@@ -117,13 +130,18 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen
 
     # wrap for video recording
     if args_cli.video:
+        # Create a unique ID
+        unique_id = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
+        target_video_dir = os.path.join(_humanoid_rl_dir, "HumanoidRLSetup", "tasks", "bimanual_cabinet", "videos")
+        
         video_kwargs = {
-            "video_folder": os.path.join(log_dir, "videos", "train"),
+            "video_folder": target_video_dir,
+            "name_prefix": f"train_{unique_id}",
             "step_trigger": lambda step: step % args_cli.video_interval == 0,
             "video_length": args_cli.video_length,
             "disable_logger": True,
         }
-        print("[INFO] Recording videos during training.")
+        print(f"[INFO] Recording videos to {target_video_dir} during training.")
         print_dict(video_kwargs, nesting=4)
         env = gym.wrappers.RecordVideo(env, **video_kwargs)
 
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.onnx b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.onnx
deleted file mode 100644
index 73d83eee..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.onnx and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.pt
deleted file mode 100644
index 17520a56..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/exported/policy.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/model_200.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/model_200.pt
deleted file mode 100644
index 1b208407..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/cabinet_humanoid_arm/2026-03-13_22-39-15/model_200.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/g1_flat/2026-05-31_22-53-49/model_1499.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/g1_flat/2026-05-31_22-53-49/model_1499.pt
deleted file mode 100644
index 61a4126b..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/g1_flat/2026-05-31_22-53-49/model_1499.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-17-05/model_1200.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-17-05/model_1200.pt
deleted file mode 100644
index 5aa70ae0..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-17-05/model_1200.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-45-56/model_400.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-45-56/model_400.pt
deleted file mode 100644
index 6ada78e1..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/lift_so101/2026-06-02_18-45-56/model_400.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.onnx b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.onnx
deleted file mode 100644
index 350c18ff..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.onnx and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.pt
deleted file mode 100644
index 1949aaf2..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/exported/policy.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/model_299.pt b/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/model_299.pt
deleted file mode 100644
index 6804083c..00000000
Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/logs/rsl_rl/reach_humanoid_arm/2026-03-13_03-30-17/model_299.pt and /dev/null differ
diff --git a/autonomy/simulation/Humanoid_Wato/wato_bimanual_arm/urdf/armDouble.SLDASM.urdf b/autonomy/simulation/Humanoid_Wato/wato_bimanual_arm/urdf/armDouble.SLDASM.urdf
index 90801ca5..da7c2447 100644
--- a/autonomy/simulation/Humanoid_Wato/wato_bimanual_arm/urdf/armDouble.SLDASM.urdf
+++ b/autonomy/simulation/Humanoid_Wato/wato_bimanual_arm/urdf/armDouble.SLDASM.urdf
@@ -96,11 +96,7 @@
       link="link1" />
     <axis
       xyz="0 -1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link2">
@@ -154,11 +150,7 @@
       link="link2" />
     <axis
       xyz="1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link3">
@@ -212,11 +204,7 @@
       link="link3" />
     <axis
       xyz="0 -1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link4">
@@ -270,11 +258,7 @@
       link="link4" />
     <axis
       xyz="-1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link5">
@@ -328,11 +312,7 @@
       link="link5" />
     <axis
       xyz="0 -1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link6">
@@ -386,11 +366,7 @@
       link="link6" />
     <axis
       xyz="1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link7">
@@ -444,11 +420,7 @@
       link="link7" />
     <axis
       xyz="-1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link8">
@@ -502,11 +474,7 @@
       link="link8" />
     <axis
       xyz="-1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link1L">
@@ -560,11 +528,7 @@
       link="link1L" />
     <axis
       xyz="0 1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link2l">
@@ -618,11 +582,7 @@
       link="link2l" />
     <axis
       xyz="-1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link3l">
@@ -676,11 +636,7 @@
       link="link3l" />
     <axis
       xyz="0 1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link4l">
@@ -734,11 +690,7 @@
       link="link4l" />
     <axis
       xyz="-1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link5l">
@@ -792,11 +744,7 @@
       link="link5l" />
     <axis
       xyz="0 1 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link6l">
@@ -850,11 +798,7 @@
       link="link6l" />
     <axis
       xyz="1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link7l">
@@ -908,11 +852,7 @@
       link="link7l" />
     <axis
       xyz="1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
   <link
     name="link8l">
@@ -966,10 +906,6 @@
       link="link8l" />
     <axis
       xyz="1 0 0" />
-    <limit
-      lower="0"
-      upper="0"
-      effort="0"
-      velocity="0" />
+    <limit lower="-3.1415" upper="3.1415" effort="100" velocity="10" />
   </joint>
 </robot>
\ No newline at end of file
diff --git a/autonomy/simulation/Teleop/keyboard-based teleoperation/joint_inspector.py b/autonomy/simulation/Teleop/keyboard-based teleoperation/joint_inspector.py
new file mode 100644
index 00000000..12b66797
--- /dev/null
+++ b/autonomy/simulation/Teleop/keyboard-based teleoperation/joint_inspector.py	
@@ -0,0 +1,344 @@
+"""
+joint_inspector.py
+==================
+Interactively inspect every joint (then every link) of the WATO bimanual arm.
+
+PHASE 1 — JOINTS
+  • The script sweeps the currently-selected joint back and forth with a
+    smooth sinusoidal motion so you can clearly see its range of motion.
+  • All other joints stay at their default (zero) position.
+  • Press  P  to advance to the next joint.
+  • After all joints have been visited, Phase 2 begins automatically.
+
+PHASE 2 — LINKS (body-frame visualisation)
+  • A coloured axis-frame marker is placed at each link's world position
+    so you can see exactly where each link lives in space.
+  • Press  P  to cycle the highlighted marker to the next link.
+  • Press  P  on the last link to exit.
+
+Run (non-headless so you can press P):
+    python joint_inspector.py
+"""
+
+# ── AppLauncher must be the very first Isaac import ──────────────────────────
+import argparse
+import os
+import sys
+
+from isaaclab.app import AppLauncher
+
+parser = argparse.ArgumentParser(description="Joint & Link Inspector")
+AppLauncher.add_app_launcher_args(parser)
+args_cli = parser.parse_args()
+args_cli.headless = False          # GUI is required to press P
+
+app_launcher = AppLauncher(args_cli)
+simulation_app = app_launcher.app
+
+# ── All other imports AFTER SimulationApp is alive ───────────────────────────
+import math
+import torch
+import numpy as np
+
+import isaaclab.sim as sim_utils
+from isaaclab.assets import AssetBaseCfg
+from isaaclab.assets.articulation import ArticulationCfg
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.managers import SceneEntityCfg
+from isaaclab.markers import VisualizationMarkers
+from isaaclab.markers.config import FRAME_MARKER_CFG
+from isaaclab.scene import InteractiveScene, InteractiveSceneCfg
+from isaaclab.utils import configclass
+
+# omni.ui is available only in GUI mode
+import omni.ui as ui
+
+# ── WATO bimanual arm config (same URDF as task_space_test) ──────────────────
+_BIMANUAL_URDF_PATH = os.path.abspath(
+    os.path.join(
+        os.path.dirname(__file__),
+        "..", "..",
+        "Humanoid_Wato", "wato_bimanual_arm", "urdf", "armDouble.SLDASM.urdf",
+    )
+)
+
+WATO_BIMANUAL_ARM_CFG = ArticulationCfg(
+    spawn=sim_utils.UrdfFileCfg(
+        asset_path=_BIMANUAL_URDF_PATH,
+        fix_base=True,
+        joint_drive=sim_utils.UrdfFileCfg.JointDriveCfg(
+            drive_type="force",
+            target_type="position",
+            gains=sim_utils.UrdfFileCfg.JointDriveCfg.PDGainsCfg(
+                stiffness=400.0,
+                damping=40.0,
+            ),
+        ),
+        rigid_props=sim_utils.RigidBodyPropertiesCfg(
+            disable_gravity=False,
+            max_depenetration_velocity=5.0,
+        ),
+        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
+            enabled_self_collisions=False,
+        ),
+    ),
+    init_state=ArticulationCfg.InitialStateCfg(
+        pos=(0.0, 0.0, 0.0),
+        joint_pos={
+            "joint1": 0.0, "joint2": 0.0, "joint3": 0.0,
+            "joint4": 0.0, "joint5": 0.0, "joint6": 0.0,
+            "joint7": 0.0, "joint8": 0.0,
+            "joint1L": 0.0, "joint2l": 0.0, "joint3l": 0.0,
+            "joint4l": 0.0, "joint5l": 0.0, "joint6l": 0.0,
+            "joint7l": 0.0, "joint8l": 0.0,
+        },
+    ),
+    actuators={
+        "right_arm": ImplicitActuatorCfg(
+            joint_names_expr=["joint[1-6]"],
+            stiffness=400.0, damping=40.0, velocity_limit_sim=3.0,
+        ),
+        "right_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint[7-8]"],
+            stiffness=200.0, damping=20.0, velocity_limit_sim=1.0,
+        ),
+        "left_arm": ImplicitActuatorCfg(
+            joint_names_expr=["joint[1-6]L", "joint[2-6]l"],
+            stiffness=400.0, damping=40.0, velocity_limit_sim=3.0,
+        ),
+        "left_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint[7-8]l"],
+            stiffness=200.0, damping=20.0, velocity_limit_sim=1.0,
+        ),
+    },
+)
+
+
+# ── Scene config ─────────────────────────────────────────────────────────────
+@configclass
+class InspectorSceneCfg(InteractiveSceneCfg):
+    ground = AssetBaseCfg(
+        prim_path="/World/defaultGroundPlane",
+        spawn=sim_utils.GroundPlaneCfg(),
+        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.0, 0.0, -1.05)),
+    )
+    dome_light = AssetBaseCfg(
+        prim_path="/World/Light",
+        spawn=sim_utils.DomeLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75)),
+    )
+    robot = WATO_BIMANUAL_ARM_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
+
+
+# ── Sweep amplitude (radians) per joint ──────────────────────────────────────
+# Tune these if a joint hits a hard limit and looks wrong.
+# Format: { joint_name_substring : amplitude_rad }
+JOINT_AMPLITUDES = {
+    "joint1":  0.8,
+    "joint2":  0.8,
+    "joint3":  0.8,
+    "joint4":  0.8,
+    "joint5":  0.8,
+    "joint6":  0.8,
+    "joint7":  0.03,   # gripper finger (metres or radians depending on URDF)
+    "joint8":  0.03,
+    "joint1L": 0.8,
+    "joint2l": 0.8,
+    "joint3l": 0.8,
+    "joint4l": 0.8,
+    "joint5l": 0.8,
+    "joint6l": 0.8,
+    "joint7l": 0.03,
+    "joint8l": 0.03,
+}
+DEFAULT_AMPLITUDE = 0.6          # fallback if joint name not in the dict above
+SWEEP_FREQUENCY_HZ = 0.4        # full back-and-forth cycles per second
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+def run_simulator(sim: sim_utils.SimulationContext, scene: InteractiveScene):
+    robot = scene["robot"]
+
+    # ── Resolve all joints ───────────────────────────────────────────────────
+    all_joints_cfg = SceneEntityCfg("robot", joint_names=[".*"], body_names=[".*"])
+    all_joints_cfg.resolve(scene)
+
+    all_joint_names  = robot.data.joint_names          # list[str]  length = N
+    all_body_names   = robot.data.body_names           # list[str]  length = M
+    num_joints       = len(all_joint_names)
+    num_bodies       = len(all_body_names)
+
+    print("\n" + "=" * 60)
+    print(f"  Found {num_joints} joints and {num_bodies} links (bodies)")
+    print("=" * 60)
+    for i, n in enumerate(all_joint_names):
+        print(f"  joint [{i:02d}]  {n}")
+    print("-" * 60)
+    for i, n in enumerate(all_body_names):
+        print(f"  body  [{i:02d}]  {n}")
+    print("=" * 60 + "\n")
+
+    # ── Visualisation markers (one axis-frame per body, shown in link phase) ─
+    marker_cfg = FRAME_MARKER_CFG.copy()
+    marker_cfg.markers["frame"].scale = (0.08, 0.08, 0.08)
+    link_markers = VisualizationMarkers(
+        marker_cfg.replace(prim_path="/Visuals/LinkMarkers")
+    )
+
+    # ── Inspector state machine ───────────────────────────────────────────────
+    # phase = "joints"  →  we sweep joint[current_idx] sinusoidally
+    # phase = "links"   →  we freeze all joints, highlight body[current_idx]
+    state = {
+        "phase":       "joints",
+        "current_idx": 0,
+        "advance":     False,   # set True by the P-key callback
+    }
+
+    # ── P-key callback ────────────────────────────────────────────────────────
+    def _on_press_p():
+        state["advance"] = True
+
+    # Minimal UI window with a "Next (P)" button (pressing P on keyboard also works)
+    ctrl_window = ui.Window("Inspector Controls", width=260, height=110)
+    with ctrl_window.frame:
+        with ui.VStack(spacing=8):
+            ui.Label("Press  P  (keyboard) or click below", alignment=ui.Alignment.CENTER)
+            ui.Button("▶  Next joint / link  (P)", clicked_fn=_on_press_p, height=44)
+
+    # Keyboard hook for  P
+    try:
+        from isaaclab.devices import Se3Keyboard  # use its internal carb hook
+        # We only want the P key — wrap with a tiny carb input listener instead
+        import carb.input as carb_input
+        import omni.appwindow
+
+        _app_window   = omni.appwindow.get_default_app_window()
+        _keyboard     = _app_window.get_keyboard()
+        _input_iface  = carb_input.acquire_input_interface()
+
+        def _keyboard_cb(event, *args, **kwargs):
+            if (event.type == carb_input.KeyboardEventType.KEY_PRESS and
+                    event.input == carb_input.KeyboardInput.P):
+                _on_press_p()
+            return True
+
+        _sub = _input_iface.subscribe_to_keyboard_events(_keyboard, _keyboard_cb)
+        print("[INFO] Keyboard listener attached — press  P  in the viewport to advance.")
+    except Exception as exc:
+        print(f"[WARN] Could not attach keyboard listener: {exc}")
+        print("[INFO] Use the UI button to advance instead.")
+
+    # ── Helpers ───────────────────────────────────────────────────────────────
+    sim_dt  = sim.get_physics_dt()
+    elapsed = 0.0            # seconds since phase/joint began
+
+    def _get_amplitude(joint_name: str) -> float:
+        for key, amp in JOINT_AMPLITUDES.items():
+            if key.lower() in joint_name.lower():
+                return amp
+        return DEFAULT_AMPLITUDE
+
+    def _print_current():
+        if state["phase"] == "joints":
+            idx = state["current_idx"]
+            amp = _get_amplitude(all_joint_names[idx])
+            print(f"\n[INSPECTOR] ── JOINT PHASE  [{idx+1}/{num_joints}] ──")
+            print(f"             Sweeping joint  '{all_joint_names[idx]}'")
+            print(f"             Amplitude : ±{amp:.3f} rad   Freq: {SWEEP_FREQUENCY_HZ} Hz")
+            print(f"             Press  P  to move to the next joint.\n")
+        else:
+            idx = state["current_idx"]
+            print(f"\n[INSPECTOR] ── LINK PHASE  [{idx+1}/{num_bodies}] ──")
+            print(f"             Highlighting body  '{all_body_names[idx]}'")
+            print(f"             (axis-frame marker shown at link origin)")
+            print(f"             Press  P  to move to the next link.\n")
+
+    _print_current()
+
+    # ── Main loop ─────────────────────────────────────────────────────────────
+    while simulation_app.is_running():
+
+        # ── Handle advance request ─────────────────────────────────────────
+        if state["advance"]:
+            state["advance"] = False
+            elapsed          = 0.0
+
+            if state["phase"] == "joints":
+                # Reset arm to default before moving to next
+                reset_pos = robot.data.default_joint_pos.clone()
+                reset_vel = robot.data.default_joint_vel.clone()
+                robot.write_joint_state_to_sim(reset_pos, reset_vel)
+
+                state["current_idx"] += 1
+                if state["current_idx"] >= num_joints:
+                    # All joints done → switch to link phase
+                    state["phase"]       = "links"
+                    state["current_idx"] = 0
+                    print("\n[INSPECTOR] ══ All joints inspected!  Starting LINK phase. ══\n")
+            else:
+                state["current_idx"] += 1
+                if state["current_idx"] >= num_bodies:
+                    print("\n[INSPECTOR] ══ All links inspected!  Closing simulation. ══\n")
+                    break
+
+            _print_current()
+
+        # ── JOINT PHASE: sinusoidal sweep of the current joint ─────────────
+        if state["phase"] == "joints":
+            joint_idx  = state["current_idx"]
+            joint_name = all_joint_names[joint_idx]
+            amplitude  = _get_amplitude(joint_name)
+
+            # Build target = all zeros (default) except the active joint
+            target = robot.data.default_joint_pos.clone()   # shape [1, N]
+            angle  = amplitude * math.sin(2.0 * math.pi * SWEEP_FREQUENCY_HZ * elapsed)
+            target[0, joint_idx] = angle
+
+            robot.set_joint_position_target(target)
+
+        # ── LINK PHASE: freeze joints, show marker at the selected body ────
+        else:
+            # Keep joints at default (already reset when we entered this phase)
+            target = robot.data.default_joint_pos.clone()
+            robot.set_joint_position_target(target)
+
+            # Build marker positions: dim all to near-zero, highlight current
+            body_idx    = state["current_idx"]
+            body_states = robot.data.body_state_w   # [1, M, 13]
+
+            positions  = body_states[0, :, 0:3].clone()   # [M, 3]
+            quats      = body_states[0, :, 3:7].clone()   # [M, 4]
+
+            # Repeat shapes to feed into visualizer (needs [N,3] and [N,4])
+            link_markers.visualize(positions, quats)
+
+            # Print the current link's world position once per 120 steps (~1.2 s)
+            if int(elapsed / sim_dt) % 120 == 0:
+                pos = body_states[0, body_idx, 0:3].cpu().numpy()
+                print(f"[LINK pos]  '{all_body_names[body_idx]}'  →  "
+                      f"x={pos[0]:.4f}  y={pos[1]:.4f}  z={pos[2]:.4f}")
+
+        # ── Physics step ──────────────────────────────────────────────────
+        scene.write_data_to_sim()
+        sim.step()
+        scene.update(sim_dt)
+        elapsed += sim_dt
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+def main():
+    sim_cfg = sim_utils.SimulationCfg(dt=0.01, device=args_cli.device)
+    sim     = sim_utils.SimulationContext(sim_cfg)
+    sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
+
+    scene_cfg = InspectorSceneCfg(num_envs=1, env_spacing=2.0)
+    scene     = InteractiveScene(scene_cfg)
+
+    sim.reset()
+    print("[INFO]: Setup complete — starting Joint Inspector...")
+
+    run_simulator(sim, scene)
+
+
+if __name__ == "__main__":
+    main()
+    simulation_app.close()
diff --git a/autonomy/simulation/Teleop/keyboard-based teleoperation/task_space_test.py b/autonomy/simulation/Teleop/keyboard-based teleoperation/task_space_test.py
new file mode 100644
index 00000000..1485ff3e
--- /dev/null
+++ b/autonomy/simulation/Teleop/keyboard-based teleoperation/task_space_test.py	
@@ -0,0 +1,1059 @@
+import argparse
+import os
+from isaaclab.app import AppLauncher
+import h5py
+import numpy as np
+from pathlib import Path
+from datetime import datetime
+from isaaclab.utils.math import subtract_frame_transforms, quat_mul, quat_from_euler_xyz
+
+"""Robot Arm Teleoperation (headless-compatible) with Task Space IK Control"""
+
+parser = argparse.ArgumentParser(description="Robot Arm Teleoperation with Task Space IK Control")
+# AppLauncher adds --enable_cameras, so we don't need to add it manually here
+AppLauncher.add_app_launcher_args(parser)
+args_cli = parser.parse_args()
+
+# Force enable cameras if not set, as they are required for sensors
+if not args_cli.enable_cameras:
+    args_cli.enable_cameras = True
+
+app_launcher = AppLauncher(args_cli)
+simulation_app = app_launcher.app
+
+import torch
+
+import isaaclab.sim as sim_utils
+from isaaclab.assets import AssetBaseCfg
+from isaaclab.controllers import DifferentialIKController, DifferentialIKControllerCfg
+from isaaclab.managers import SceneEntityCfg
+from isaaclab.markers import VisualizationMarkers
+from isaaclab.markers.config import FRAME_MARKER_CFG
+from isaaclab.scene import InteractiveScene, InteractiveSceneCfg
+from isaaclab.sensors import Camera, CameraCfg
+from isaaclab.utils import configclass
+from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
+from isaaclab.utils.math import subtract_frame_transforms
+
+# Monkey-patch warp.types.array to fix Isaac Sim compatibility issue
+# The installed version of warp doesn't accept 'owner' arg, but Isaac Sim passes it
+try:
+    import warp as wp
+    if hasattr(wp.types, 'array'):
+        _original_warp_array = wp.types.array
+        def _patched_warp_array(*args, **kwargs):
+            if 'owner' in kwargs:
+                # Remove the incompatible argument
+                del kwargs['owner']
+            return _original_warp_array(*args, **kwargs)
+        wp.types.array = _patched_warp_array
+        print("[INFO] Applied Warp array monkey-patch for compatibility")
+except ImportError:
+    pass
+
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.assets.articulation import ArticulationCfg
+
+# ---- WATO Bimanual Arm Config (loaded from URDF) ----
+_BIMANUAL_URDF_PATH = os.path.join(
+    os.path.dirname(__file__),
+    "..", "..",
+    "Humanoid_Wato", "wato_bimanual_arm", "urdf", "armDouble.SLDASM.urdf",
+)
+_BIMANUAL_URDF_PATH = os.path.abspath(_BIMANUAL_URDF_PATH)
+
+# Right arm: joints joint1-joint6 (arm), joint7/8 (gripper fingers)
+# Left arm:  joints joint1L-joint6l (arm), joint7l/8l (gripper fingers)
+# EE for right arm = link6  |  EE for left arm = link6l
+WATO_BIMANUAL_ARM_CFG = ArticulationCfg(
+    spawn=sim_utils.UrdfFileCfg(
+        asset_path=_BIMANUAL_URDF_PATH,
+        fix_base=True,
+        joint_drive=sim_utils.UrdfFileCfg.JointDriveCfg(
+            drive_type="force",
+            target_type="position",
+            gains=sim_utils.UrdfFileCfg.JointDriveCfg.PDGainsCfg(
+                stiffness=400.0,
+                damping=40.0,
+            ),
+        ),
+        rigid_props=sim_utils.RigidBodyPropertiesCfg(
+            disable_gravity=False,
+            max_depenetration_velocity=5.0,
+        ),
+        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
+            enabled_self_collisions=False,
+        ),
+    ),
+    init_state=ArticulationCfg.InitialStateCfg(
+        pos=(0.0, 0.0, 0.0),
+        joint_pos={
+            # Right arm — start with shoulder slightly raised so arm
+            # faces forward (not fully sideways along +Y)
+            "joint1": 0.0,
+            "joint2": -0.78,   # ~-45° : lifts arm to a natural forward pose
+            "joint3": 0.0,
+            "joint4": 1.2,    # ~+69° : elbow bent so EE is in front of robot
+            "joint5": 0.0,
+            "joint6": 0.0,
+            "joint7": 0.0,
+            "joint8": 0.0,
+            # Left arm
+            "joint1L": 0.0,
+            "joint2l": 0.0,
+            "joint3l": 0.0,
+            "joint4l": 0.0,
+            "joint5l": 0.0,
+            "joint6l": 0.0,
+            "joint7l": 0.0,
+            "joint8l": 0.0,
+        },
+    ),
+    actuators={
+        "right_arm": ImplicitActuatorCfg(
+            joint_names_expr=["joint[1-6]"],
+            stiffness=400.0,
+            damping=40.0,
+            velocity_limit_sim=3.0,
+        ),
+        "right_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint[7-8]"],
+            stiffness=200.0,
+            damping=20.0,
+            velocity_limit_sim=1.0,
+        ),
+        "left_arm": ImplicitActuatorCfg(
+            joint_names_expr=["joint[1-6]L", "joint[2-6]l"],
+            stiffness=400.0,
+            damping=40.0,
+            velocity_limit_sim=3.0,
+        ),
+        "left_gripper": ImplicitActuatorCfg(
+            joint_names_expr=["joint[7-8]l"],
+            stiffness=200.0,
+            damping=20.0,
+            velocity_limit_sim=1.0,
+        ),
+    },
+)
+
+# Video recording imports
+try:
+    import cv2
+    CV2_AVAILABLE = True
+except ImportError:
+    CV2_AVAILABLE = False
+    print("[WARN] OpenCV not available, video recording will use imageio instead")
+
+try:
+    import imageio
+    IMAGEIO_AVAILABLE = True
+except ImportError:
+    IMAGEIO_AVAILABLE = False
+
+# Import Se3Keyboard and Omni UI only if GUI mode
+if not args_cli.headless:
+    from isaaclab.devices import Se3Keyboard
+    import omni.ui as ui
+
+
+# Add recording functionality
+import h5py
+import numpy as np
+from pathlib import Path
+
+class DemonstrationRecorder:
+    """
+    Records robot demonstrations with timestamped filenames.
+    Each session creates a new file: robot_demos_YYYYMMDD_HHMMSS.hdf5
+    Also saves video recordings to the 'recordings' folder.
+    """
+    
+    def __init__(self, save_dir="demonstrations", recordings_dir="recordings"):
+        # Create save directories if they don't exist
+        self.save_dir = Path(save_dir)
+        self.save_dir.mkdir(exist_ok=True, parents=True)
+        
+        self.recordings_dir = Path(recordings_dir)
+        self.recordings_dir.mkdir(exist_ok=True, parents=True)
+        
+        self.episodes = []
+        
+        # Determine starting episode number by scanning existing recordings
+        self.episode_counter = 0
+        if self.recordings_dir.exists():
+            import re
+            pattern = re.compile(r"episode_(\d+)_")
+            max_ep = -1
+            for file_path in self.recordings_dir.glob("*.mp4"):
+                match = pattern.search(file_path.name)
+                if match:
+                    try:
+                        ep_num = int(match.group(1))
+                        if ep_num > max_ep:
+                            max_ep = ep_num
+                    except ValueError:
+                        pass
+            if max_ep >= 0:
+                self.episode_counter = max_ep + 1
+                print(f"[INFO] Found existing recordings up to episode_{max_ep}. Resuming recording from episode_{self.episode_counter}")
+        
+        self.current_episode = {
+            'episode_num': 0,
+            'observations': [],
+            'actions': [],
+            'ee_poses': [],
+            'joint_positions': [],
+            'video_frames': {}  # Dictionary of lists {cam_name: [frames]}
+        }
+        self.recording = False
+        self.video_fps = 10  # Frames per second for saved videos
+        
+        print(f"[INFO] HDF5 and MP4 files will save per-episode to: {self.save_dir} and {self.recordings_dir}")
+    
+    def start_episode(self):
+        """Start recording a new episode"""
+        self.recording = True
+        self.current_episode = {
+            'episode_num': self.episode_counter,
+            'observations': [],
+            'actions': [],
+            'ee_poses': [],
+            'joint_positions': [],
+            'video_frames': {}  # Dictionary of lists
+        }
+        print(f"[RECORDING] Started episode (will be episode_{self.episode_counter})")
+    
+    def add_transition(self, obs, action, ee_pose, joint_pos, video_frames_dict=None):
+        """Add a single transition to the current episode
+        
+        Args:
+            video_frames_dict: Dictionary mapping camera names to frame data
+        """
+        if self.recording:
+            self.current_episode['observations'].append(obs.cpu().numpy())
+            self.current_episode['actions'].append(action.cpu().numpy())
+            self.current_episode['ee_poses'].append(ee_pose.cpu().numpy())
+            self.current_episode['joint_positions'].append(joint_pos.cpu().numpy())
+            
+            # Store video frames if provided
+            if video_frames_dict is not None and isinstance(video_frames_dict, dict):
+                for cam_name, frame_data in video_frames_dict.items():
+                    if cam_name not in self.current_episode['video_frames']:
+                        self.current_episode['video_frames'][cam_name] = []
+                    
+                    # Handle frame data conversion if needed (though simulation loop usually handles this now)
+                    if hasattr(frame_data, 'cpu'):
+                        frame = frame_data.cpu().numpy()
+                    elif hasattr(frame_data, 'numpy'):
+                        frame = frame_data.numpy()
+                    elif isinstance(frame_data, list):
+                        frame = np.array(frame_data)
+                    else:
+                        frame = frame_data
+                        
+                    self.current_episode['video_frames'][cam_name].append(frame)
+    
+    def _save_video(self, frames, episode_num, cam_suffix=""):
+        """Save video frames to MP4 file"""
+        if len(frames) == 0:
+            return None
+        
+        # Add camera suffix if provided
+        name_part = f"episode_{episode_num}"
+        if cam_suffix:
+            name_part += f"_{cam_suffix}"
+            
+        video_filename = f"{name_part}.mp4"
+        video_path = self.recordings_dir / video_filename
+        
+        try:
+            if CV2_AVAILABLE:
+                # Use OpenCV for video writing
+                height, width = frames[0].shape[:2]
+                fourcc = cv2.VideoWriter_fourcc(*'mp4v')
+                out = cv2.VideoWriter(str(video_path), fourcc, self.video_fps, (width, height))
+                for frame in frames:
+                    # Convert RGB to BGR for OpenCV
+                    if len(frame.shape) == 3 and frame.shape[2] == 3:
+                        frame_bgr = cv2.cvtColor(frame.astype(np.uint8), cv2.COLOR_RGB2BGR)
+                    else:
+                        frame_bgr = frame.astype(np.uint8)
+                    out.write(frame_bgr)
+                out.release()
+                print(f"[VIDEO] Saved {len(frames)} frames to {video_path}")
+            elif IMAGEIO_AVAILABLE:
+                # Use imageio for video writing
+                imageio.mimwrite(str(video_path), [f.astype(np.uint8) for f in frames], fps=self.video_fps)
+                print(f"[VIDEO] Saved {len(frames)} frames to {video_path}")
+            else:
+                print("[WARN] No video library available (cv2 or imageio). Video not saved.")
+                return None
+            return str(video_path)
+        except Exception as e:
+            print(f"[ERROR] Failed to save video: {e}")
+            return None
+    
+    def end_episode(self):
+        """End the current episode, save video, and store episode data"""
+        if self.recording and len(self.current_episode['observations']) > 0:
+            episode_num = self.current_episode['episode_num']
+            num_steps = len(self.current_episode['observations'])
+            
+            # Save videos for all cameras
+            video_paths = {}
+            total_frames = 0
+            
+            for cam_name, frames in self.current_episode['video_frames'].items():
+                if len(frames) > 0:
+                    path = self._save_video(frames, episode_num, cam_suffix=cam_name)
+                    if path:
+                        video_paths[cam_name] = path
+                        total_frames += len(frames)
+            
+            # Store video path reference and clear frames from memory
+            self.current_episode['video_paths'] = video_paths
+            self.current_episode['video_frames'] = {}  # Clear frames to save memory
+            
+            self.episodes.append(self.current_episode)
+            self.episode_counter += 1  # Increment global counter
+            print(f"[RECORDING] Episode {episode_num} completed with {num_steps} steps. Saved {len(video_paths)} videos.")
+            
+            # Automatically save the HDF5 file alongside the MP4s
+            self.save()
+            self.get_quick_summary()
+        elif self.recording:
+            print("[WARN] Episode ended but no data was recorded")
+        self.recording = False
+    
+    def save(self):
+        """Save each episode from this session to its own HDF5 file"""
+        if len(self.episodes) == 0:
+            print("[WARN] No episodes to save")
+            return
+        
+        saved_count = 0
+        try:
+            for episode in self.episodes:
+                episode_num = episode.get('episode_num', 0)
+                file_path = self.save_dir / f"robot_demos_{episode_num}.hdf5"
+                
+                # Use 'w' mode as each episode gets its own new file
+                with h5py.File(file_path, 'w') as f:
+                    grp = f.create_group(f'episode_{episode_num}')
+                    for key, value in episode.items():
+                        # Skip video_frames (already saved as MP4) and internal metadata
+                        if key in ('video_frames', 'episode_num'):
+                            continue
+                        elif key == 'video_path':
+                             grp.attrs['video_path'] = value if value else ""
+                        elif key == 'video_paths':
+                            # Store paths for all cameras
+                            for cam_name, path in value.items():
+                                grp.attrs[f'video_path_{cam_name}'] = str(path)
+                        else:
+                            grp.create_dataset(key, data=np.array(value))
+                saved_count += 1
+            
+            print(f"[SUCCESS] Saved {saved_count} episodes individual files to {self.save_dir.absolute()}")
+            
+            # Clear episodes from memory after successful save
+            self.episodes = []
+            
+        except Exception as e:
+            print(f"[ERROR] Failed to save demonstrations: {e}")
+    
+    def get_stats(self, detailed=False):
+        """Print comprehensive statistics about recorded demonstrations."""
+        print("\n" + "="*60)
+        print("DEMONSTRATION STATISTICS")
+        print("="*60)
+        
+        print(f"\n📁 Saving format: 1 file per episode (e.g., robot_demos_0.hdf5)")
+        
+        # Episodes in memory
+        episodes_in_memory = [len(ep['observations']) for ep in self.episodes]
+        total_in_memory = len(self.episodes)
+        
+        print(f"\n📊 This Session:")
+        print(f"   Episodes recorded: {total_in_memory}")
+        
+        if episodes_in_memory:
+            total_timesteps = sum(episodes_in_memory)
+            avg_length = np.mean(episodes_in_memory)
+            min_length = min(episodes_in_memory)
+            max_length = max(episodes_in_memory)
+            
+            print(f"\n📈 Episode Statistics:")
+            print(f"   Total timesteps: {total_timesteps:,}")
+            print(f"   Average: {avg_length:.1f} steps")
+            print(f"   Min: {min_length} steps")
+            print(f"   Max: {max_length} steps")
+            
+            if detailed:
+                print(f"\n📋 Per-Episode Breakdown:")
+                for i, steps in enumerate(episodes_in_memory):
+                    print(f"   episode_{i} in memory: {steps} steps (unsaved)")
+        
+        # Show all files in directory
+        if self.save_dir.exists():
+            all_files = sorted(self.save_dir.glob("robot_demos_*.hdf5"))
+            if all_files:
+                print(f"\n📂 All Sessions in {self.save_dir}:")
+                total_all_episodes = 0
+                for file_path in all_files:
+                    try:
+                        with h5py.File(file_path, 'r') as f:
+                            num_eps = len(f.keys())
+                            total_all_episodes += num_eps
+                            print(f"   {file_path.name}: {num_eps} episodes")
+                    except:
+                        print(f"   {file_path.name}: Error reading")
+                print(f"\n   Total episodes across all sessions: {total_all_episodes}")
+        
+        print(f"\n🎯 Training Readiness:")
+        if total_in_memory > 0:
+            print(f"   ⚠ WARNING: {total_in_memory} unsaved episodes!")
+            print(f"   → Click 'Save All Demos' to save them")
+        else:
+            print(f"   ✓ All episodes saved")
+        
+        print("="*60 + "\n")
+    
+    def get_quick_summary(self):
+        """Print a quick one-line summary"""
+        unsaved = len(self.episodes)
+        status = "✓" if unsaved == 0 else f"⚠ {unsaved} unsaved"
+        print(f"[STATS] This session: {len(self.episodes)} episodes {status}")
+
+def _get_cam_quat(angle_deg):
+    try:
+        angle_rad = np.deg2rad(angle_deg)
+        q = quat_from_euler_xyz(
+            torch.tensor([0.0]),
+            torch.tensor([np.deg2rad(30.0)]),
+            torch.tensor([angle_rad + np.pi])
+        )
+        return tuple(q[0].tolist())
+    except:
+        return (1.0, 0.0, 0.0, 0.0)
+
+@configclass
+class TableTopSceneCfg(InteractiveSceneCfg):
+    """Configuration for a simple tabletop scene with a robot."""
+
+    ground = AssetBaseCfg(
+        prim_path="/World/defaultGroundPlane",
+        spawn=sim_utils.GroundPlaneCfg(),
+        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.0, 0.0, -1.05)),
+    )
+
+    dome_light = AssetBaseCfg(
+        prim_path="/World/Light",
+        spawn=sim_utils.DomeLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
+    )
+
+    table = AssetBaseCfg(
+        prim_path="{ENV_REGEX_NS}/Table",
+        spawn=sim_utils.UsdFileCfg(
+            usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Mounts/Stand/stand_instanceable.usd", scale=(2.0, 2.0, 2.0)
+        ),
+    )
+
+    # Elevated platform/table for the pickable cube
+    cube_table = AssetBaseCfg(
+        prim_path="/World/CubeTable",
+        spawn=sim_utils.CuboidCfg(
+            size=[0.4, 0.4, 0.6],  # 40cm x 40cm surface, 60cm tall
+            rigid_props=sim_utils.RigidBodyPropertiesCfg(
+                disable_gravity=True,
+                kinematic_enabled=True,
+            ),
+            collision_props=sim_utils.CollisionPropertiesCfg(),
+            visual_material=sim_utils.PreviewSurfaceCfg(
+                diffuse_color=(0.8, 0.6, 0.4),  # Wood-like color
+            ),
+        ),
+        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.8, 0, -0.1)),  # Moved away from origin, base at ground
+    )
+
+    # Original cube
+    cube = AssetBaseCfg(
+        prim_path="/World/cube",
+        spawn=sim_utils.CuboidCfg(size=[0.1, 0.1, 0.1]),
+        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.5, 0.0, 0.5)),
+    )
+
+    # Second pickable cube - now on elevated table
+    cube2 = AssetBaseCfg(
+        prim_path="/World/cube2",
+        spawn=sim_utils.CuboidCfg(
+            size=[0.03, 0.03, 0.03],
+            rigid_props=sim_utils.RigidBodyPropertiesCfg(
+                disable_gravity=False,
+                max_depenetration_velocity=1.0,
+            ),
+            mass_props=sim_utils.MassPropertiesCfg(mass=0.2),
+            collision_props=sim_utils.CollisionPropertiesCfg(),
+            visual_material=sim_utils.PreviewSurfaceCfg(
+                diffuse_color=(0.2, 0.6, 0.9),  # Blue color for visibility
+            ),
+        ),
+        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.8, 0, 0.23)),  # On top of table, matching table position
+    )
+
+    # WATO bimanual arm (right arm controlled, left arm locked at zero)
+    robot = WATO_BIMANUAL_ARM_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
+    
+    # Main Camera (Camera 1 position - user's preferred angle)
+    # Position: 36 degrees around circle, radius 1.5m from (0.5, 0), height 1.0m
+    camera = CameraCfg(
+        prim_path="{ENV_REGEX_NS}/Camera",
+        update_period=0.1,  # 10 Hz
+        height=480,
+        width=640,
+        data_types=["rgb"],
+        spawn=sim_utils.PinholeCameraCfg(
+            focal_length=24.0,
+            focus_distance=400.0,
+            horizontal_aperture=20.955,
+            clipping_range=(0.1, 20.0),
+        ),
+        offset=CameraCfg.OffsetCfg(
+            pos=(1.71, 0.88, 1.0),  # Camera 1 position (36 deg)
+            rot=_get_cam_quat(36.0),
+            convention="world",
+        ),
+    )
+
+    # Camera 3 (108 degrees - left side view)
+    camera_3 = CameraCfg(
+        prim_path="{ENV_REGEX_NS}/Camera_3",
+        update_period=0.1,
+        height=480,
+        width=640,
+        data_types=["rgb"],
+        spawn=sim_utils.PinholeCameraCfg(
+            focal_length=24.0,
+            focus_distance=400.0,
+            horizontal_aperture=20.955,
+            clipping_range=(0.1, 20.0),
+        ),
+        offset=CameraCfg.OffsetCfg(
+            pos=(0.04, 1.43, 1.0),  # Camera 3 position (108 deg)
+            rot=_get_cam_quat(108.0),
+            convention="world",
+        ),
+    )
+
+    # Camera 9 (324 degrees - right side view, opposite of Camera 3)
+    camera_9 = CameraCfg(
+        prim_path="{ENV_REGEX_NS}/Camera_9",
+        update_period=0.1,
+        height=480,
+        width=640,
+        data_types=["rgb"],
+        spawn=sim_utils.PinholeCameraCfg(
+            focal_length=24.0,
+            focus_distance=400.0,
+            horizontal_aperture=20.955,
+            clipping_range=(0.1, 20.0),
+        ),
+        offset=CameraCfg.OffsetCfg(
+            pos=(1.71, -0.88, 1.0),  # Camera 9 position (324 deg)
+            rot=_get_cam_quat(324.0),
+            convention="world",
+        ),
+    )
+
+    # Visual marker for main camera (so you can see where it is)
+    camera_marker = AssetBaseCfg(
+        prim_path="{ENV_REGEX_NS}/CameraMarker",
+        spawn=sim_utils.CuboidCfg(
+            size=[0.1, 0.1, 0.2],  # Small box representing camera
+            visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 0.0, 0.0)),  # Red color
+        ),
+        init_state=AssetBaseCfg.InitialStateCfg(
+            pos=(1.71, 0.88, 1.0),  # MATCHES MAIN CAMERA POSITION
+            rot=_get_cam_quat(36.0),
+        ),
+    )
+
+
+
+# Note: Camera 1 (main), Camera 3, and Camera 9 are now defined inside the class above.
+# The dynamic loop has been removed for cleaner configuration.
+
+
+def run_simulator(sim: sim_utils.SimulationContext, scene: InteractiveScene):
+    robot = scene["robot"]
+    
+    # Enable collisions for gripper to allow grasping
+    print("[INFO] Enabling gripper collisions for object interaction...")
+
+    diff_ik_cfg = DifferentialIKControllerCfg(command_type="position", use_relative_mode=True, ik_method="dls")
+    diff_ik_controller = DifferentialIKController(diff_ik_cfg, num_envs=scene.num_envs, device=sim.device)
+
+    # -----------------------
+    # SPEED / MOTION PARAMETERS
+    # Tune these to adjust arm feel.
+    # MAX_EE_STEP_M : max EE displacement per physics step (dt=0.01 s)
+    #   0.005 m/step = 0.5 m/s  (slow / safe)
+    #   0.01  m/step = 1.0 m/s  (moderate)
+    #   0.02  m/step = 2.0 m/s  (fast — default)
+    #   0.05  m/step = 5.0 m/s  (may cause IK instability near singularities)
+    # -----------------------
+    MAX_EE_STEP_M    = 0.02   # metres per physics step  → 2.0 m/s max EE speed
+    position_smoothing = 0.15
+    rotation_smoothing = 0.12
+    max_linear_velocity = 0.6
+    max_angular_velocity = 1.0
+    slow_zone_threshold = 0.15
+    min_speed_ratio = 0.1
+    position_deadband = 0.002
+    rotation_deadband = 0.02
+
+    frame_marker_cfg = FRAME_MARKER_CFG.copy()
+    frame_marker_cfg.markers["frame"].scale = (0.1, 0.1, 0.1)
+    ee_marker = VisualizationMarkers(frame_marker_cfg.replace(prim_path="/Visuals/ee_current"))
+    goal_marker = VisualizationMarkers(frame_marker_cfg.replace(prim_path="/Visuals/ee_goal"))
+
+    # WATO bimanual arm: control right arm (joint1-joint6), EE = link6
+    robot_entity_cfg = SceneEntityCfg(
+        "robot",
+        joint_names=["joint[1-6]$"],   # right arm revolute joints only
+        body_names=["link6"],           # right arm wrist / EE body
+    )
+    robot_entity_cfg.resolve(scene)
+
+    # ── DEBUG: confirm resolved IDs ──────────────────────────────────────────
+    print(f"[DEBUG] All robot joint names : {robot.data.joint_names}")
+    print(f"[DEBUG] Controlled joint IDs  : {robot_entity_cfg.joint_ids}")
+    print(f"[DEBUG] Controlled joint names: {[robot.data.joint_names[i] for i in robot_entity_cfg.joint_ids]}")
+    print(f"[DEBUG] EE body IDs           : {robot_entity_cfg.body_ids}")
+    print(f"[DEBUG] All body names        : {robot.data.body_names}")
+    print(f"[DEBUG] EE body names         : {[robot.data.body_names[i] for i in robot_entity_cfg.body_ids]}")
+    print(f"[DEBUG] is_fixed_base         : {robot.is_fixed_base}")
+    print(f"[DEBUG] ee_jacobi_idx         : {robot_entity_cfg.body_ids[0] - 1 if robot.is_fixed_base else robot_entity_cfg.body_ids[0]}")
+    # ─────────────────────────────────────────────────────────────────────────
+
+    # ── Joint role labels (based on physical observation) ────────────────────
+    # These map the controlled joint index (0-5, matching joint1-joint6) to
+    # a human-readable description of what motion that joint produces.
+    JOINT_ROLES = {
+        0: "joint1  | Shoulder Rotation       (twist about vertical axis)",
+        1: "joint2  | Shoulder Elevation      (arm up / down at shoulder)",
+        2: "joint3  | Elbow Rotation          (forearm twist / supination)",
+        3: "joint4  | Elbow Flexion           (elbow bend up / down)",
+        4: "joint5  | Forearm Rotation        (wrist-level roll)",
+        5: "joint6  | Wrist Flexion           (wrist bend up / down)",
+    }
+
+    # Periodic joint-state print counter (prints every 200 steps ≈ 2 s)
+    _periodic_debug_counter = 0
+    _PERIODIC_DEBUG_INTERVAL = 200   # steps
+
+    ee_jacobi_idx = robot_entity_cfg.body_ids[0] - 1 if robot.is_fixed_base else robot_entity_cfg.body_ids[0]
+    sim_dt = 0.01
+
+    # Initialize joint states to defaults (all zeros per URDF)
+    joint_position = robot.data.default_joint_pos.clone()
+    joint_vel = robot.data.default_joint_vel.clone()
+    robot.write_joint_state_to_sim(joint_position, joint_vel)
+    print(f"[DEBUG] default_joint_pos shape: {joint_position.shape}")
+    print(f"[DEBUG] default_joint_pos      : {joint_position}")
+
+    # -----------------------
+    # GRIPPER SETUP
+    # -----------------------
+    joint_names = [n.lower() for n in robot.data.joint_names]
+    gripper_candidates = [i for i, n in enumerate(joint_names) if any(k in n for k in ["finger", "gripper", "hand", "claw", "panda_finger"])]
+    gripper_joint_ids = gripper_candidates
+
+    if len(gripper_joint_ids) > 0:
+        print("[INFO] Detected gripper joints:", [robot.data.joint_names[i] for i in gripper_joint_ids])
+    else:
+        print("[WARN] No gripper joints detected. Gripper control will be disabled.")
+
+    gripper_open_pos = 0.04
+    gripper_closed_pos = 0.0
+    gripper_target_norm = 0.0
+    gripper_open_bool = True
+
+    def gripper_norm_to_joint_positions(norm):
+        pos = gripper_open_pos + (gripper_closed_pos - gripper_open_pos) * norm
+        if len(gripper_joint_ids) == 0:
+            return None
+        return torch.tensor([[pos] * len(gripper_joint_ids)], device=sim.device)
+
+    # ========================================================================
+    # ADD DEMONSTRATION RECORDER HERE - STEP 1: Initialize recorder
+    # ========================================================================
+    recorder = DemonstrationRecorder("demonstrations", "recordings")
+    print("[INFO] Demonstration recorder initialized")
+    
+    # Get camera from scene for video recording
+    print(f"[DEBUG] Scene keys: {list(scene.keys())}")
+    camera = scene["camera"] if "camera" in scene.keys() else None
+    if camera is not None:
+        print(f"[INFO] Camera sensor found in scene")
+        # Don't access camera.data here as it might not be initialized yet
+    else:
+        print("[WARN] No camera sensor found in scene. Video recording disabled.")
+
+    # -----------------------
+    # TELEOP / HEADLESS SETUP
+    # -----------------------
+    if not args_cli.headless:
+        try:
+            teleop = Se3Keyboard(pos_sensitivity=0.15, rot_sensitivity=0.05)
+        except TypeError:
+            try:
+                teleop = Se3Keyboard(0.05, 0.05)
+            except TypeError:
+                teleop = Se3Keyboard()
+                teleop.pos_sensitivity = 0.05
+                teleop.rot_sensitivity = 0.05
+        teleop.reset()
+        print("[INFO] Teleoperation active — use WASDQE to move and arrow keys to rotate.")
+        print("[INFO] Smooth approach enabled: arm will slow down near goal")
+        teleop_has_extra_keys = False
+
+        gripper_state = {"open": True}
+
+        def _toggle_gripper_cb():
+            gripper_state["open"] = not gripper_state["open"]
+            print(f"[UI] Gripper toggled -> {'OPEN' if gripper_state['open'] else 'CLOSED'}")
+
+        # ========================================================================
+        # ADD RECORDING CONTROLS HERE - STEP 2: Add UI buttons for recording
+        # ========================================================================
+        def _start_recording():
+            recorder.start_episode()
+            print("[RECORDING] Started new episode - perform your demonstration")
+
+        def _stop_recording():
+            recorder.end_episode()
+            print("[RECORDING] Stopped episode and saved HDF5 data")
+
+        # Create UI windows
+        gripper_window = ui.Window("Gripper", width=180, height=80)
+        with gripper_window.frame:
+            with ui.VStack(spacing=10):
+                ui.Label("Gripper Control")
+                ui.Button("Toggle Gripper", clicked_fn=_toggle_gripper_cb, height=40)
+
+        # Recording control window
+        recording_window = ui.Window("Recording", width=180, height=110, position_x=200)
+        with recording_window.frame:
+            with ui.VStack(spacing=10):
+                ui.Label("Demonstration Recording")
+                ui.Button("Start Recording", clicked_fn=_start_recording, height=40)
+                ui.Button("Stop Recording / Save", clicked_fn=_stop_recording, height=40)
+
+    else:
+        step = 0
+        headless_cycle_t = 0
+        print("[INFO] Running headless simulation with scripted motion + gripper sequence...")
+
+    #ee_pose_w = robot.data.body_state_w[:, robot_entity_cfg.body_ids[0], 0:7]
+    #goal_pose = ee_pose_w.clone()
+    #smooth_target_pose = goal_pose.clone()
+    #previous_smooth_pose = smooth_target_pose.clone()
+
+    ee_command = torch.zeros(1, 3, device=sim.device)
+    _debug_step_counter = 0   # throttle per-step IK command debug output
+
+    # Main simulation loop
+    # Initialise ee_quat_b so the first teleop frame can use it as a fallback.
+    ee_quat_b = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=sim.device)  # identity
+    while simulation_app.is_running():
+        # ------------------------------------------------------------------
+        # Read current EE pose FIRST so the teleop block can use it to
+        # rotate the keyboard delta from EE-frame into base-frame.
+        # ------------------------------------------------------------------
+        ee_pose_w   = robot.data.body_state_w[:, robot_entity_cfg.body_ids[0], 0:7]
+        root_pose_w = robot.data.root_state_w[:, 0:7]
+        joint_pos   = robot.data.joint_pos[:, robot_entity_cfg.joint_ids]
+
+        ee_pos_b, ee_quat_b = subtract_frame_transforms(
+            root_pose_w[:, 0:3], root_pose_w[:, 3:7],
+            ee_pose_w[:, 0:3],   ee_pose_w[:, 3:7]
+        )
+
+        if not args_cli.headless:
+            ret = teleop.advance()
+            if isinstance(ret, tuple) and len(ret) == 3:
+                pos_delta, rot_delta, _ = ret
+            else:
+                pos_delta, rot_delta = ret
+
+            # Zero the command each step — only non-zero when a key is held
+            ee_command = torch.zeros(1, 3, device=sim.device)
+
+            if isinstance(pos_delta, (list, tuple, np.ndarray)):
+                # Se3Keyboard outputs fixed-axis deltas in the world / robot-base
+                # frame (W always = +X, A always = +Y, Q always = +Z) regardless
+                # of where the arm is.  The IK controller expects its relative
+                # position command in the same base frame, so NO rotation is
+                # needed — just clamp the magnitude and pass through directly.
+                _delta = torch.tensor(pos_delta[:3], dtype=torch.float32, device=sim.device)
+                _step_norm = float(_delta.norm())
+                if _step_norm > 1e-8:
+                    # Clamp to MAX_EE_STEP_M (defined in the speed-params block
+                    # above) so the IK solver stays stable at any pos_sensitivity.
+                    if _step_norm > MAX_EE_STEP_M:
+                        _delta = _delta * (MAX_EE_STEP_M / _step_norm)
+                    ee_command[0, 0:3] = _delta
+                # else: command stays zero
+            gripper_open_bool = gripper_state["open"]
+            gripper_target_norm = 0.0 if gripper_open_bool else 1.0
+
+        else:
+            # Headless scripted motion — send a small sinusoidal x delta
+            ee_command = torch.zeros(1, 3, device=sim.device)
+            ee_command[0, 0] = 0.002 * float(torch.sin(torch.tensor(step * 0.1)))
+            step += 1
+
+            headless_cycle_t += 1
+            cycle_len = 200
+            tmod = headless_cycle_t % cycle_len
+            gripper_target_norm = 0.0 if tmod < (cycle_len // 2) else 1.0
+            gripper_open_bool = gripper_target_norm == 0.0
+
+        # -----------------------
+        # SMOOTH APPROACH LOGIC
+        # -----------------------
+        
+        '''position_error = goal_pose[:, 0:3] - smooth_target_pose[:, 0:3]
+        distance_to_goal = torch.norm(position_error, dim=1, keepdim=True)
+        
+        speed_scale = torch.clamp(
+            distance_to_goal / slow_zone_threshold,
+            min=min_speed_ratio,
+            max=1.0
+        )
+        
+        if distance_to_goal.item() > position_deadband:
+            position_delta = position_smoothing * speed_scale * position_error
+            max_position_delta = max_linear_velocity * sim_dt
+            position_delta_norm = torch.norm(position_delta)
+            if position_delta_norm > max_position_delta:
+                position_delta = position_delta / position_delta_norm * max_position_delta
+            smooth_target_pose[:, 0:3] += position_delta
+        
+        goal_quat = goal_pose[:, 3:7]
+        current_quat = smooth_target_pose[:, 3:7]
+        dot_product = torch.sum(goal_quat * current_quat, dim=1, keepdim=True)
+        goal_quat_corrected = torch.where(dot_product < 0, -goal_quat, goal_quat)
+        rotation_error = goal_quat_corrected - current_quat
+        rotation_error_magnitude = torch.norm(rotation_error, dim=1, keepdim=True)
+        
+        if rotation_error_magnitude.item() > rotation_deadband:
+            rotation_delta = rotation_smoothing * rotation_error
+            max_rotation_delta = max_angular_velocity * sim_dt
+            rotation_delta_norm = torch.norm(rotation_delta)
+            if rotation_delta_norm > max_rotation_delta:
+                rotation_delta = rotation_delta / rotation_delta_norm * max_rotation_delta
+            smooth_target_pose[:, 3:7] += rotation_delta
+            smooth_target_pose[:, 3:7] = smooth_target_pose[:, 3:7] / torch.norm(smooth_target_pose[:, 3:7], dim=1, keepdim=True)
+        
+        alpha = 0.7
+        smooth_target_pose = alpha * smooth_target_pose + (1 - alpha) * previous_smooth_pose
+        smooth_target_pose[:, 3:7] = smooth_target_pose[:, 3:7] / torch.norm(smooth_target_pose[:, 3:7], dim=1, keepdim=True)
+        previous_smooth_pose = smooth_target_pose.clone()'''
+
+        # IK computation — EE pose, joint_pos, ee_pos_b, ee_quat_b
+        # are already up-to-date from the top-of-loop read.
+
+        # Pass current orientation so controller can hold it fixed
+        diff_ik_controller.set_command(ee_command, ee_pos=ee_pos_b, ee_quat=ee_quat_b)
+
+        jacobian = robot.root_physx_view.get_jacobians()[:, ee_jacobi_idx, :, robot_entity_cfg.joint_ids]
+        joint_pos_des = diff_ik_controller.compute(ee_pos_b, ee_quat_b, jacobian, joint_pos)
+
+        # Clamp joint targets to ±2π to prevent runaway solutions near singularities.
+        joint_pos_des = torch.clamp(joint_pos_des, min=-2 * torch.pi, max=2 * torch.pi)
+
+        # Clamp per-step joint DELTA so no single joint moves more than 5° (0.087 rad)
+        # in one physics step. This prevents IK from commanding unreachable jumps
+        # that the physical actuators cannot track, which is the root cause of the
+        # arm not following the desired EE path.
+        _MAX_JOINT_DELTA_RAD = 0.70   # ~40° per step — large visible motion per keypress
+        _joint_delta = joint_pos_des - joint_pos
+        _delta_norm  = _joint_delta.abs().max(dim=1, keepdim=True).values
+        _scale       = torch.where(
+            _delta_norm > _MAX_JOINT_DELTA_RAD,
+            _MAX_JOINT_DELTA_RAD / _delta_norm.clamp(min=1e-6),
+            torch.ones_like(_delta_norm),
+        )
+        joint_pos_des = joint_pos + _joint_delta * _scale
+
+        # ── DEBUG: IK command block (fires on first 5 steps of each key press) ──
+        _cmd_norm = float(ee_command.norm())
+        if _cmd_norm > 1e-6 and _debug_step_counter < 5:
+            _debug_step_counter += 1
+            import math as _math
+            _jp_cur = joint_pos.cpu().numpy()[0]       # shape [6]
+            _jp_des = joint_pos_des.cpu().numpy()[0]   # shape [6]
+            _jac_norm = float(jacobian.norm())
+            print(f"\n[IK DEBUG — step {_debug_step_counter}] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
+            print(f"  EE command (base frame dx,dy,dz) : {ee_command.cpu().numpy()}")
+            print(f"  EE pos  (base frame)             : {ee_pos_b.cpu().numpy()}")
+            print(f"  EE quat (base frame, w,x,y,z)    : {ee_quat_b.cpu().numpy()}")
+            print(f"  Jacobian norm                    : {_jac_norm:.6f}  (0 → IK solver blind)")
+            print(f"  ─── Per-joint breakdown ───────────────────────────────────────────")
+            for _ji in range(len(_jp_cur)):
+                _role  = JOINT_ROLES.get(_ji, f"joint{_ji+1}  | (unknown role)")
+                _cur_d = _math.degrees(_jp_cur[_ji])
+                _des_d = _math.degrees(_jp_des[_ji])
+                _dlt_d = _des_d - _cur_d
+                _flag  = "  ← large jump!" if abs(_dlt_d) > 20.0 else ""
+                print(f"  [{_ji}] {_role}")
+                print(f"      current : {_jp_cur[_ji]:+.4f} rad  ({_cur_d:+7.2f}°)")
+                print(f"      desired : {_jp_des[_ji]:+.4f} rad  ({_des_d:+7.2f}°)")
+                print(f"      delta   : {_jp_des[_ji]-_jp_cur[_ji]:+.4f} rad  ({_dlt_d:+7.2f}°){_flag}")
+            print(f"  ───────────────────────────────────────────────────────────────────")
+        elif _cmd_norm < 1e-6:
+            _debug_step_counter = 0  # reset counter when keys released
+        # ──────────────────────────────────────────────────────────────────────────
+
+        # ── PERIODIC joint state snapshot (every ~2 s even when idle) ─────────
+        _periodic_debug_counter += 1
+        if _periodic_debug_counter >= _PERIODIC_DEBUG_INTERVAL:
+            _periodic_debug_counter = 0
+            import math as _math
+            _jp_now = robot.data.joint_pos[:, robot_entity_cfg.joint_ids].cpu().numpy()[0]
+            _jv_now = robot.data.joint_vel[:, robot_entity_cfg.joint_ids].cpu().numpy()[0]
+            _ee_now = ee_pose_w.cpu().numpy()[0]
+            print(f"\n[JOINT SNAPSHOT] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
+            print(f"  EE world pos (x,y,z) : ({_ee_now[0]:+.4f}, {_ee_now[1]:+.4f}, {_ee_now[2]:+.4f})")
+            print(f"  EE world quat(w,x,y,z): ({_ee_now[3]:+.4f}, {_ee_now[4]:+.4f}, {_ee_now[5]:+.4f}, {_ee_now[6]:+.4f})")
+            print(f"  ─── Controlled joints (right arm joint1-joint6) ────────────────────")
+            for _ji in range(len(_jp_now)):
+                _role = JOINT_ROLES.get(_ji, f"joint{_ji+1}")
+                _ang  = _math.degrees(_jp_now[_ji])
+                _vel  = _math.degrees(_jv_now[_ji])
+                print(f"  [{_ji}] {_role}")
+                print(f"      pos : {_jp_now[_ji]:+.4f} rad ({_ang:+7.2f}°)   vel: {_jv_now[_ji]:+.4f} rad/s ({_vel:+7.2f}°/s)")
+            print(f"  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n")
+        # ──────────────────────────────────────────────────────────────────────────
+
+        # Set arm joint targets
+        robot.set_joint_position_target(joint_pos_des, joint_ids=robot_entity_cfg.joint_ids)
+
+        # Apply gripper joint targets
+        if len(gripper_joint_ids) > 0:
+            gripper_joint_positions = gripper_norm_to_joint_positions(gripper_target_norm)
+            if gripper_joint_positions is not None:
+                robot.set_joint_position_target(
+                    gripper_joint_positions,
+                    joint_ids=gripper_joint_ids
+                )
+
+        # ========================================================================
+        # RECORD DEMONSTRATION DATA HERE - STEP 3: Record transitions
+        # ========================================================================
+        # Create observation (matching what you'll use for training)
+        joint_pos_full = robot.data.joint_pos[:, robot_entity_cfg.joint_ids]
+        joint_vel_full = robot.data.joint_vel[:, robot_entity_cfg.joint_ids]
+        
+        # Observation: [joint_pos(7), joint_vel(7), ee_pose(7), gripper_state(1)]
+        obs = torch.cat([
+            joint_pos_full,
+            joint_vel_full,
+            ee_pose_w,
+            torch.tensor([[gripper_target_norm]], device=sim.device)
+        ], dim=-1)
+        
+        # Action: joint position targets + gripper target
+        action = torch.cat([
+            ee_command,  # 3D: [dx, dy, dz]
+            torch.tensor([[gripper_target_norm]], device=sim.device)
+        ], dim=-1)
+        
+        # Record the transition
+        # Record the transition
+        # Capture frames from all cameras
+        video_frames_dict = {}
+        
+        # Helper to capture from a single camera sensor
+        def _get_camera_frame(sensor):
+            if sensor is None: return None
+            try:
+                rgb_data = sensor.data.output["rgb"]
+                if rgb_data is None: return None
+                
+                # Handle different tensor formats
+                if hasattr(rgb_data, 'shape') and len(rgb_data.shape) >= 3:
+                     # Get first environment if batched
+                     frame_data = rgb_data[0] if len(rgb_data.shape) == 4 else rgb_data
+                     
+                     # Convert to numpy
+                     if hasattr(frame_data, 'cpu'):
+                         v_frame = frame_data.clone().cpu().numpy()
+                     elif hasattr(frame_data, 'numpy'):
+                         v_frame = frame_data.numpy().copy()
+                     else:
+                         v_frame = np.asarray(frame_data).copy()
+                         
+                     # Ensure uint8
+                     if v_frame.dtype != np.uint8:
+                         v_frame = (v_frame * 255).astype(np.uint8) if v_frame.max() <= 1.0 else v_frame.astype(np.uint8)
+                     return v_frame
+            except Exception as e:
+                # Use a flag to print error only once per sensor
+                if not hasattr(sensor, '_error_printed'):
+                    print(f"[WARN] Frame capture error for {sensor}: {e}")
+                    sensor._error_printed = True
+            return None
+
+        # Iterate over all likely camera sensors in the scene
+        for key in scene.keys():
+            # Skip markers, look for 'camera' in name (case insensitive)
+            if "camera" in key.lower() and "marker" not in key.lower():
+                try:
+                    sensor = scene[key]
+                    # Duck typing check for data.output
+                    if hasattr(sensor, 'data') and hasattr(sensor.data, 'output'):
+                         frame = _get_camera_frame(sensor)
+                         if frame is not None:
+                             video_frames_dict[key] = frame
+                except:
+                    pass
+        
+        recorder.add_transition(
+            obs=obs,
+            action=action,
+            ee_pose=ee_pose_w,
+            joint_pos=joint_pos_full,
+            video_frames_dict=video_frames_dict
+        )
+
+        # Write/update simulation
+        scene.write_data_to_sim()
+        sim.step()
+        scene.update(sim_dt)
+
+        # Visualize EE & goal
+        ee_pose_w = robot.data.body_state_w[:, robot_entity_cfg.body_ids[0], 0:7]
+        ee_marker.visualize(ee_pose_w[:, 0:3], ee_pose_w[:, 3:7])
+        #goal_marker.visualize(goal_pose[:, 0:3] + scene.env_origins, goal_pose[:, 3:7])
+
+def main():
+    sim_cfg = sim_utils.SimulationCfg(dt=0.01, device=args_cli.device)
+    sim = sim_utils.SimulationContext(sim_cfg)
+
+    sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
+    scene_cfg = TableTopSceneCfg(num_envs=1, env_spacing=2.0)
+    scene = InteractiveScene(scene_cfg)
+
+    sim.reset()
+    print("[INFO]: Setup complete...")
+
+    run_simulator(sim, scene)
+
+
+if __name__ == "__main__":
+    main()
+    simulation_app.close()
\ No newline at end of file
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/events.out.tfevents.1781115023.937e70f2dc01.1256.0 b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/events.out.tfevents.1781115023.937e70f2dc01.1256.0
new file mode 100644
index 00000000..2eb4e5b4
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/events.out.tfevents.1781115023.937e70f2dc01.1256.0 differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.onnx b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.onnx
new file mode 100644
index 00000000..9f7bcf98
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.onnx differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.pt
new file mode 100644
index 00000000..7839ac46
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/exported/policy.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/git/humanoid.diff b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/git/humanoid.diff
new file mode 100644
index 00000000..1c4ab094
--- /dev/null
+++ b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/git/humanoid.diff
@@ -0,0 +1,258 @@
+--- git status ---
+On branch reinforcement-learning-scripts
+Changes not staged for commit:
+  (use "git add/rm <file>..." to update what will be committed)
+  (use "git restore <file>..." to discard changes in working directory)
+	deleted:    autonomy/il/humanoid_il.egg-info/PKG-INFO
+	deleted:    autonomy/il/humanoid_il.egg-info/SOURCES.txt
+	deleted:    autonomy/il/humanoid_il.egg-info/dependency_links.txt
+	deleted:    autonomy/il/humanoid_il.egg-info/entry_points.txt
+	deleted:    autonomy/il/humanoid_il.egg-info/requires.txt
+	deleted:    autonomy/il/humanoid_il.egg-info/top_level.txt
+	deleted:    autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/bimanual_cabinet_demo.mp4
+	deleted:    autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/demo_2026-06-10_07-14-20-step-0.meta.json
+
+Untracked files:
+  (use "git add <file>..." to include in what will be committed)
+	trained_models/bimanual_cabinet/2026-06-10_18-09-19/
+
+no changes added to commit (use "git add" and/or "git commit -a") 
+
+
+--- git diff ---
+diff --git a/autonomy/il/humanoid_il.egg-info/PKG-INFO b/autonomy/il/humanoid_il.egg-info/PKG-INFO
+deleted file mode 100644
+index 2369a84..0000000
+--- a/autonomy/il/humanoid_il.egg-info/PKG-INFO
++++ /dev/null
+@@ -1,140 +0,0 @@
+-Metadata-Version: 2.4
+-Name: humanoid-il
+-Version: 0.1.0
+-Summary: Dataset recording for the WATO humanoid arm (ROS 2 + Isaac Sim)
+-Requires-Python: >=3.10
+-Description-Content-Type: text/markdown
+-Requires-Dist: numpy>=1.24
+-Requires-Dist: pyyaml>=6.0
+-Provides-Extra: lerobot
+-Requires-Dist: lerobot>=0.3.0; extra == "lerobot"
+-Provides-Extra: keyboard
+-Requires-Dist: pynput>=1.7; extra == "keyboard"
+-Provides-Extra: ros
+-Requires-Dist: opencv-python-headless>=4.8; extra == "ros"
+-Provides-Extra: hdf5
+-Requires-Dist: h5py>=3.8; extra == "hdf5"
+-Provides-Extra: record
+-Requires-Dist: lerobot>=0.3.0; extra == "record"
+-Requires-Dist: pynput>=1.7; extra == "record"
+-Requires-Dist: h5py>=3.8; extra == "record"
+-
+-# Imitation learning: record (`humanoid-record`)
+-
+-Record teleoperation into **LeRobot** and/or **HDF5 (LeWM)** datasets. One shared frame contract for the **real arm (ROS 2)** and **Isaac Sim keyboard teleop**.
+-
+-Training stays outside this repo (`lerobot-train` on the LeRobot folder). HDF5 output is compatible with [LeWorldModel / stable-worldmodel](https://galilai-group.github.io/stable-worldmodel/) style loaders (`action`, `proprio`, `pixels`, `ep_len`, `ep_offset`).
+-
+-## Data contract
+-
+-**6-DOF left arm** — same joint order as `joint_command_core.cpp`:
+-
+-1. shoulder pitch, roll, yaw  
+-2. elbow pitch, roll  
+-3. wrist pitch  
+-
+-| Field | Source | Units |
+-|-------|--------|-------|
+-| `observation.state` / `proprio` | measured joint positions | rad |
+-| `action` | commanded joint targets (IK output on sim, `/behaviour/arm_pose` on robot) | rad |
+-| `observation.images.*` / `pixels` | cameras in schema (optional for sim) | uint8 |
+-| `task` | `--task_description` | string |
+-
+-## Layout
+-
+-```
+-autonomy/il/
+-├── config/
+-│   ├── dataset_schema.yaml       # real robot (ROS + wrist camera)
+-│   └── dataset_schema_sim.yaml   # Isaac sim (joint-only, no camera)
+-├── humanoid_il/
+-│   ├── snapshot.py               # ObservationSnapshot
+-│   ├── frame.py                  # build_lerobot_frame()
+-│   ├── recorder.py               # RecordSession (episode flags + sinks)
+-│   ├── record_loop.py            # blocking loop for ROS CLI
+-│   ├── sim_session.py            # helper for Isaac sim scripts
+-│   ├── sinks/
+-│   │   ├── lerobot.py            # Parquet + MP4
+-│   │   └── hdf5_lewm.py          # trajectories.h5
+-│   ├── schema.py
+-│   ├── record.py                 # humanoid-record CLI
+-│   └── ros_buffer.py
+-└── README.md
+-```
+-
+-## Install
+-
+-```bash
+-cd autonomy/il
+-pip install -e ".[record]"          # real arm + sim recording
+-pip install -e ".[record,ros]"      # + ROS image decoding
+-```
+-
+-## Real robot (ROS 2)
+-
+-```bash
+-source /path/to/humanoid/install/setup.bash
+-
+-humanoid-record \
+-  --task_description "reach forward" \
+-  --num_episodes 10 \
+-  --sink lerobot
+-```
+-
+-Both formats:
+-
+-```bash
+-humanoid-record --sink lerobot,hdf5 --num_episodes 10
+-```
+-
+-**Keyboard** (needs `pynput`):
+-
+-| Key | Effect |
+-|-----|--------|
+-| S | Start logging frames for this episode |
+-| N | Finish episode → `save_episode` |
+-| D | Discard buffer, re-record same episode |
+-| Esc | Abort and `finalize` |
+-
+-**Dry run** (no robot):
+-
+-```bash
+-humanoid-record --dry_run --sink lerobot,hdf5 --num_episodes 2 --episode_time_s 3
+-```
+-
+-Output: `datasets/record/001/` with LeRobot tree + `trajectories.h5`.
+-
+-## Isaac Sim (keyboard teleop)
+-
+-From `autonomy/simulation/Teleop/keyboard-based teleoperation/`:
+-
+-```bash
+-pip install -e ../../../il[record]
+-
+-PYTHONPATH=$(pwd) <path>/IsaacLab/isaaclab.sh -p keyboard_teleop.py --record \
+-  --sink lerobot,hdf5 \
+-  --num_episodes 5 \
+-  --task_description "reach and grasp"
+-```
+-
+-Uses `config/dataset_schema_sim.yaml` (joint-only, no camera). Same S/N/D/Esc keys as real-arm recording.
+-
+-Output: `datasets/record_sim/001/`.
+-
+-## Train (LeRobot, external)
+-
+-```bash
+-lerobot-train \
+-  --dataset.repo_id=humanoid/local_left_arm \
+-  --dataset.root=datasets/record/001 \
+-  --policy.type=act \
+-  --output_dir=outputs/train/humanoid_act_v1
+-```
+-
+-## Sinks
+-
+-| `--sink` | Output | Use case |
+-|----------|--------|----------|
+-| `lerobot` | `data/`, `videos/`, `meta/` | `lerobot-train`, HuggingFace Hub |
+-| `hdf5` | `trajectories.h5` | LeWM / JEPA world-model pipelines |
+-| `lerobot,hdf5` | both under same `001/` folder | sim validation + BC training |
+diff --git a/autonomy/il/humanoid_il.egg-info/SOURCES.txt b/autonomy/il/humanoid_il.egg-info/SOURCES.txt
+deleted file mode 100644
+index f889ff1..0000000
+--- a/autonomy/il/humanoid_il.egg-info/SOURCES.txt
++++ /dev/null
+@@ -1,24 +0,0 @@
+-README.md
+-pyproject.toml
+-humanoid_il/__init__.py
+-humanoid_il/arm_pose_io.py
+-humanoid_il/episode_keys.py
+-humanoid_il/frame.py
+-humanoid_il/observation.py
+-humanoid_il/record.py
+-humanoid_il/record_loop.py
+-humanoid_il/record_utils.py
+-humanoid_il/recorder.py
+-humanoid_il/ros_buffer.py
+-humanoid_il/schema.py
+-humanoid_il/sim_session.py
+-humanoid_il/snapshot.py
+-humanoid_il.egg-info/PKG-INFO
+-humanoid_il.egg-info/SOURCES.txt
+-humanoid_il.egg-info/dependency_links.txt
+-humanoid_il.egg-info/entry_points.txt
+-humanoid_il.egg-info/requires.txt
+-humanoid_il.egg-info/top_level.txt
+-humanoid_il/sinks/__init__.py
+-humanoid_il/sinks/hdf5_lewm.py
+-humanoid_il/sinks/lerobot.py
+\ No newline at end of file
+diff --git a/autonomy/il/humanoid_il.egg-info/dependency_links.txt b/autonomy/il/humanoid_il.egg-info/dependency_links.txt
+deleted file mode 100644
+index 8b13789..0000000
+--- a/autonomy/il/humanoid_il.egg-info/dependency_links.txt
++++ /dev/null
+@@ -1 +0,0 @@
+-
+diff --git a/autonomy/il/humanoid_il.egg-info/entry_points.txt b/autonomy/il/humanoid_il.egg-info/entry_points.txt
+deleted file mode 100644
+index c25975e..0000000
+--- a/autonomy/il/humanoid_il.egg-info/entry_points.txt
++++ /dev/null
+@@ -1,2 +0,0 @@
+-[console_scripts]
+-humanoid-record = humanoid_il.record:main
+diff --git a/autonomy/il/humanoid_il.egg-info/requires.txt b/autonomy/il/humanoid_il.egg-info/requires.txt
+deleted file mode 100644
+index 1fc4e75..0000000
+--- a/autonomy/il/humanoid_il.egg-info/requires.txt
++++ /dev/null
+@@ -1,19 +0,0 @@
+-numpy>=1.24
+-pyyaml>=6.0
+-
+-[hdf5]
+-h5py>=3.8
+-
+-[keyboard]
+-pynput>=1.7
+-
+-[lerobot]
+-lerobot>=0.3.0
+-
+-[record]
+-lerobot>=0.3.0
+-pynput>=1.7
+-h5py>=3.8
+-
+-[ros]
+-opencv-python-headless>=4.8
+diff --git a/autonomy/il/humanoid_il.egg-info/top_level.txt b/autonomy/il/humanoid_il.egg-info/top_level.txt
+deleted file mode 100644
+index ad2fea9..0000000
+--- a/autonomy/il/humanoid_il.egg-info/top_level.txt
++++ /dev/null
+@@ -1 +0,0 @@
+-humanoid_il
+diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/bimanual_cabinet_demo.mp4 b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/bimanual_cabinet_demo.mp4
+deleted file mode 100644
+index cb973cc..0000000
+Binary files a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/bimanual_cabinet_demo.mp4 and /dev/null differ
+diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/demo_2026-06-10_07-14-20-step-0.meta.json b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/demo_2026-06-10_07-14-20-step-0.meta.json
+deleted file mode 100644
+index 03b3865..0000000
+--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/bimanual_cabinet/videos/demo_2026-06-10_07-14-20-step-0.meta.json
++++ /dev/null
+@@ -1 +0,0 @@
+-{"step_id": 0, "episode_id": 0, "content_type": "video/mp4"}
+\ No newline at end of file
\ No newline at end of file
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_0.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_0.pt
new file mode 100644
index 00000000..9c378051
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_0.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_100.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_100.pt
new file mode 100644
index 00000000..73d51a7f
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_100.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_150.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_150.pt
new file mode 100644
index 00000000..c2ce4506
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_150.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_200.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_200.pt
new file mode 100644
index 00000000..89c8ad6e
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_200.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_250.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_250.pt
new file mode 100644
index 00000000..01ad9f11
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_250.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_300.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_300.pt
new file mode 100644
index 00000000..bda89749
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_300.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_350.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_350.pt
new file mode 100644
index 00000000..b5502bf3
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_350.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_400.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_400.pt
new file mode 100644
index 00000000..5231b245
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_400.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_450.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_450.pt
new file mode 100644
index 00000000..5163dc1d
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_450.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_499.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_499.pt
new file mode 100644
index 00000000..1fc3e17b
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_499.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_50.pt b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_50.pt
new file mode 100644
index 00000000..1fd42cd1
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/model_50.pt differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.pkl b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.pkl
new file mode 100644
index 00000000..dbae32ca
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.pkl differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.yaml b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.yaml
new file mode 100644
index 00000000..fcdfad28
--- /dev/null
+++ b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/agent.yaml
@@ -0,0 +1,41 @@
+seed: 42
+device: cuda:0
+num_steps_per_env: 96
+max_iterations: 500
+empirical_normalization: true
+policy:
+  class_name: ActorCritic
+  init_noise_std: 1.0
+  actor_hidden_dims:
+  - 256
+  - 128
+  - 64
+  critic_hidden_dims:
+  - 256
+  - 128
+  - 64
+  activation: elu
+algorithm:
+  class_name: PPO
+  value_loss_coef: 1.0
+  use_clipped_value_loss: true
+  clip_param: 0.2
+  entropy_coef: 0.0
+  num_learning_epochs: 5
+  num_mini_batches: 4
+  learning_rate: 0.0005
+  schedule: adaptive
+  gamma: 0.99
+  lam: 0.95
+  desired_kl: 0.02
+  max_grad_norm: 1.0
+clip_actions: null
+save_interval: 50
+experiment_name: bimanual_cabinet
+run_name: ''
+logger: tensorboard
+neptune_project: isaaclab
+wandb_project: isaaclab
+resume: false
+load_run: .*
+load_checkpoint: model_.*.pt
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.pkl b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.pkl
new file mode 100644
index 00000000..bb3cce1b
Binary files /dev/null and b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.pkl differ
diff --git a/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.yaml b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.yaml
new file mode 100644
index 00000000..c61e649f
--- /dev/null
+++ b/trained_models/bimanual_cabinet/2026-06-10_18-09-19/params/env.yaml
@@ -0,0 +1,1006 @@
+viewer:
+  eye: !!python/tuple
+  - -2.0
+  - 2.0
+  - 2.0
+  lookat: !!python/tuple
+  - 0.8
+  - 0.0
+  - 0.5
+  cam_prim_path: /OmniverseKit_Persp
+  resolution: !!python/tuple
+  - 1280
+  - 720
+  origin_type: world
+  env_index: 0
+  asset_name: null
+  body_name: null
+sim:
+  physics_prim_path: /physicsScene
+  device: cuda:0
+  dt: 0.016666666666666666
+  render_interval: 1
+  gravity: !!python/tuple
+  - 0.0
+  - 0.0
+  - -9.81
+  enable_scene_query_support: false
+  use_fabric: true
+  physx:
+    solver_type: 1
+    min_position_iteration_count: 1
+    max_position_iteration_count: 255
+    min_velocity_iteration_count: 0
+    max_velocity_iteration_count: 255
+    enable_ccd: false
+    enable_stabilization: true
+    enable_enhanced_determinism: false
+    bounce_threshold_velocity: 0.01
+    friction_offset_threshold: 0.04
+    friction_correlation_distance: 0.00625
+    gpu_max_rigid_contact_count: 8388608
+    gpu_max_rigid_patch_count: 163840
+    gpu_found_lost_pairs_capacity: 2097152
+    gpu_found_lost_aggregate_pairs_capacity: 33554432
+    gpu_total_aggregate_pairs_capacity: 2097152
+    gpu_collision_stack_size: 67108864
+    gpu_heap_capacity: 67108864
+    gpu_temp_buffer_capacity: 16777216
+    gpu_max_num_partitions: 8
+    gpu_max_soft_body_contacts: 1048576
+    gpu_max_particle_contacts: 1048576
+  physics_material:
+    func: isaaclab.sim.spawners.materials.physics_materials:spawn_rigid_body_material
+    static_friction: 0.5
+    dynamic_friction: 0.5
+    restitution: 0.0
+    improve_patch_friction: true
+    friction_combine_mode: average
+    restitution_combine_mode: average
+    compliant_contact_stiffness: 0.0
+    compliant_contact_damping: 0.0
+  render:
+    enable_translucency: null
+    enable_reflections: null
+    enable_global_illumination: null
+    antialiasing_mode: null
+    enable_dlssg: null
+    enable_dl_denoiser: null
+    dlss_mode: null
+    enable_direct_lighting: null
+    samples_per_pixel: null
+    enable_shadows: null
+    enable_ambient_occlusion: null
+ui_window_class_type: isaaclab.envs.ui.manager_based_rl_env_window:ManagerBasedRLEnvWindow
+seed: 42
+decimation: 1
+scene:
+  num_envs: 4096
+  env_spacing: 2.0
+  lazy_sensor_update: true
+  replicate_physics: true
+  filter_collisions: true
+  robot:
+    class_type: isaaclab.assets.articulation.articulation:Articulation
+    prim_path: /World/envs/env_.*/Robot
+    spawn:
+      asset_path: /workspace/isaaclab/final_repo/humanoid/autonomy/simulation/Humanoid_Wato/wato_bimanual_arm/urdf/armDouble.SLDASM.urdf
+      usd_dir: null
+      usd_file_name: null
+      force_usd_conversion: false
+      make_instanceable: true
+      fix_base: true
+      root_link_name: null
+      link_density: 0.0
+      merge_fixed_joints: true
+      convert_mimic_joints_to_normal_joints: false
+      joint_drive:
+        drive_type: force
+        target_type: position
+        gains:
+          stiffness: 400.0
+          damping: 40.0
+      collider_type: convex_hull
+      self_collision: false
+      replace_cylinders_with_capsules: false
+      collision_from_visuals: false
+      func: isaaclab.sim.spawners.from_files.from_files:spawn_from_urdf
+      visible: true
+      semantic_tags: null
+      copy_from_source: true
+      mass_props: null
+      deformable_props: null
+      rigid_props:
+        rigid_body_enabled: null
+        kinematic_enabled: null
+        disable_gravity: false
+        linear_damping: null
+        angular_damping: null
+        max_linear_velocity: null
+        max_angular_velocity: null
+        max_depenetration_velocity: 5.0
+        max_contact_impulse: null
+        enable_gyroscopic_forces: null
+        retain_accelerations: null
+        solver_position_iteration_count: null
+        solver_velocity_iteration_count: null
+        sleep_threshold: null
+        stabilization_threshold: null
+      collision_props: null
+      activate_contact_sensors: false
+      scale: null
+      articulation_props:
+        articulation_enabled: null
+        enabled_self_collisions: false
+        solver_position_iteration_count: null
+        solver_velocity_iteration_count: null
+        sleep_threshold: null
+        stabilization_threshold: null
+        fix_root_link: null
+      fixed_tendons_props: null
+      joint_drive_props: null
+      visual_material_path: material
+      visual_material: null
+    init_state:
+      pos: !!python/tuple
+      - -0.1
+      - 0.0
+      - 0.4
+      rot: !!python/tuple
+      - 1.0
+      - 0.0
+      - 0.0
+      - 0.0
+      lin_vel: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      ang_vel: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      joint_pos:
+        joint1: -2.4574235868080163
+        joint2: 0.9721483933608416
+        joint3: -1.1519173063162575
+        joint4: 1.9442967867216832
+        joint5: 0.6073745796940266
+        joint6: 0.061086523819801536
+        joint7: -0.05
+        joint8: 0.05
+        joint1L: 2.4294983187761066
+        joint2l: 1.1536626355682518
+        joint3l: 2.5813419636996136
+        joint4l: -1.335176877775662
+        joint5l: -1.335176877775662
+        joint6l: -0.39444441095071847
+        joint7l: -0.05
+        joint8l: 0.05
+      joint_vel:
+        .*: 0.0
+    collision_group: 0
+    debug_vis: false
+    soft_joint_pos_limit_factor: 1.0
+    actuators:
+      left_shoulder:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint1L
+        - joint2l
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 18.0
+        velocity_limit_sim: 3.0
+        stiffness: 757.6
+        damping: 60.3
+        armature: null
+        friction: null
+      left_elbow:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint3l
+        - joint4l
+        - joint5l
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 9.0
+        velocity_limit_sim: 3.0
+        stiffness: 615.5
+        damping: 43.5
+        armature: null
+        friction: null
+      left_wrist:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint6l
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 0.25
+        velocity_limit_sim: 3.0
+        stiffness: 170.5
+        damping: 9.0
+        armature: null
+        friction: null
+      left_gripper:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint7l
+        - joint8l
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 30.0
+        velocity_limit_sim: 0.2
+        stiffness: 400.0
+        damping: 40.0
+        armature: null
+        friction: null
+      right_arm:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint1
+        - joint2
+        - joint3
+        - joint4
+        - joint5
+        - joint6
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 18.0
+        velocity_limit_sim: 3.0
+        stiffness: 500.0
+        damping: 50.0
+        armature: null
+        friction: null
+      right_gripper:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - joint7
+        - joint8
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 30.0
+        velocity_limit_sim: 0.2
+        stiffness: 400.0
+        damping: 40.0
+        armature: null
+        friction: null
+  cabinet:
+    class_type: isaaclab.assets.articulation.articulation:Articulation
+    prim_path: /World/envs/env_.*/Cabinet
+    spawn:
+      func: isaaclab.sim.spawners.from_files.from_files:spawn_from_usd
+      visible: true
+      semantic_tags: null
+      copy_from_source: true
+      mass_props: null
+      deformable_props: null
+      rigid_props: null
+      collision_props: null
+      activate_contact_sensors: false
+      scale: !!python/tuple
+      - 1.15
+      - 1.15
+      - 1.15
+      articulation_props: null
+      fixed_tendons_props: null
+      joint_drive_props: null
+      visual_material_path: material
+      visual_material: null
+      usd_path: http://omniverse-content-production.s3-us-west-2.amazonaws.com/Assets/Isaac/4.5/Isaac/Props/Sektion_Cabinet/sektion_cabinet_instanceable.usd
+      variants: null
+    init_state:
+      pos: !!python/tuple
+      - 0.75
+      - 0.0
+      - 0.6
+      rot: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      - 1.0
+      lin_vel: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      ang_vel: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      joint_pos:
+        door_left_joint: 0.0
+        door_right_joint: 0.0
+        drawer_bottom_joint: 0.0
+        drawer_top_joint: 0.0
+      joint_vel:
+        .*: 0.0
+    collision_group: 0
+    debug_vis: false
+    soft_joint_pos_limit_factor: 1.0
+    actuators:
+      drawers:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - drawer_top_joint
+        - drawer_bottom_joint
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 87.0
+        velocity_limit_sim: null
+        stiffness: 10.0
+        damping: 1.0
+        armature: null
+        friction: null
+      doors:
+        class_type: isaaclab.actuators.actuator_pd:ImplicitActuator
+        joint_names_expr:
+        - door_left_joint
+        - door_right_joint
+        effort_limit: null
+        velocity_limit: null
+        effort_limit_sim: 87.0
+        velocity_limit_sim: null
+        stiffness: 10.0
+        damping: 2.5
+        armature: null
+        friction: null
+  cabinet_frame:
+    class_type: isaaclab.sensors.frame_transformer.frame_transformer:FrameTransformer
+    prim_path: /World/envs/env_.*/Cabinet/sektion
+    update_period: 0.0
+    history_length: 0
+    debug_vis: true
+    source_frame_offset:
+      pos: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      rot: !!python/tuple
+      - 1.0
+      - 0.0
+      - 0.0
+      - 0.0
+    target_frames:
+    - prim_path: /World/envs/env_.*/Cabinet/drawer_handle_top
+      name: drawer_handle_top
+      offset:
+        pos: !!python/tuple
+        - 0.3485
+        - 0.0
+        - 0.01
+        rot: !!python/tuple
+        - 0.5
+        - 0.5
+        - -0.5
+        - -0.5
+    visualizer_cfg:
+      prim_path: /Visuals/CabinetFrameTransformer
+      markers:
+        frame:
+          func: isaaclab.sim.spawners.from_files.from_files:spawn_from_usd
+          visible: true
+          semantic_tags: null
+          copy_from_source: true
+          mass_props: null
+          deformable_props: null
+          rigid_props: null
+          collision_props: null
+          activate_contact_sensors: false
+          scale: !!python/tuple
+          - 0.1
+          - 0.1
+          - 0.1
+          articulation_props: null
+          fixed_tendons_props: null
+          joint_drive_props: null
+          visual_material_path: material
+          visual_material: null
+          usd_path: http://omniverse-content-production.s3-us-west-2.amazonaws.com/Assets/Isaac/4.5/Isaac/Props/UIElements/frame_prim.usd
+          variants: null
+  plane:
+    class_type: null
+    prim_path: /World/GroundPlane
+    spawn:
+      func: isaaclab.sim.spawners.from_files.from_files:spawn_ground_plane
+      visible: true
+      semantic_tags: null
+      copy_from_source: true
+      usd_path: http://omniverse-content-production.s3-us-west-2.amazonaws.com/Assets/Isaac/4.5/Isaac/Environments/Grid/default_environment.usd
+      color: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      size: !!python/tuple
+      - 100.0
+      - 100.0
+      physics_material:
+        func: isaaclab.sim.spawners.materials.physics_materials:spawn_rigid_body_material
+        static_friction: 0.5
+        dynamic_friction: 0.5
+        restitution: 0.0
+        improve_patch_friction: true
+        friction_combine_mode: average
+        restitution_combine_mode: average
+        compliant_contact_stiffness: 0.0
+        compliant_contact_damping: 0.0
+    init_state:
+      pos: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      rot: !!python/tuple
+      - 1.0
+      - 0.0
+      - 0.0
+      - 0.0
+    collision_group: -1
+    debug_vis: false
+  light:
+    class_type: null
+    prim_path: /World/light
+    spawn:
+      func: isaaclab.sim.spawners.lights.lights:spawn_light
+      visible: true
+      semantic_tags: null
+      copy_from_source: true
+      prim_type: DomeLight
+      color: !!python/tuple
+      - 0.75
+      - 0.75
+      - 0.75
+      enable_color_temperature: false
+      color_temperature: 6500.0
+      normalize: false
+      exposure: 0.0
+      intensity: 3000.0
+      texture_file: null
+      texture_format: automatic
+      visible_in_primary_ray: true
+    init_state:
+      pos: !!python/tuple
+      - 0.0
+      - 0.0
+      - 0.0
+      rot: !!python/tuple
+      - 1.0
+      - 0.0
+      - 0.0
+      - 0.0
+    collision_group: 0
+    debug_vis: false
+recorders:
+  dataset_file_handler_class_type: isaaclab.utils.datasets.hdf5_dataset_file_handler:HDF5DatasetFileHandler
+  dataset_export_dir_path: /tmp/isaaclab/logs
+  dataset_filename: dataset
+  dataset_export_mode:
+    _value_: 1
+    _name_: EXPORT_ALL
+  export_in_record_pre_reset: true
+observations:
+  policy:
+    concatenate_terms: true
+    enable_corruption: true
+    history_length: null
+    flatten_history_dim: true
+    joint_pos:
+      func: isaaclab.envs.mdp.observations:joint_pos_rel
+      params: {}
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+    joint_vel:
+      func: isaaclab.envs.mdp.observations:joint_vel_rel
+      params: {}
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+    cabinet_joint_pos:
+      func: isaaclab.envs.mdp.observations:joint_pos_rel
+      params:
+        asset_cfg:
+          name: cabinet
+          joint_names:
+          - drawer_top_joint
+          joint_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          fixed_tendon_names: null
+          fixed_tendon_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          body_names: null
+          body_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          object_collection_names: null
+          object_collection_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          preserve_order: false
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+    cabinet_joint_vel:
+      func: isaaclab.envs.mdp.observations:joint_vel_rel
+      params:
+        asset_cfg:
+          name: cabinet
+          joint_names:
+          - drawer_top_joint
+          joint_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          fixed_tendon_names: null
+          fixed_tendon_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          body_names: null
+          body_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          object_collection_names: null
+          object_collection_ids: !!python/object/apply:builtins.slice
+          - null
+          - null
+          - null
+          preserve_order: false
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+    rel_ee_drawer_distance:
+      func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.observations:rel_ee_drawer_distance
+      params: {}
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+    actions:
+      func: isaaclab.envs.mdp.observations:last_action
+      params: {}
+      modifiers: null
+      noise: null
+      clip: null
+      scale: null
+      history_length: 0
+      flatten_history_dim: true
+actions:
+  arm_action:
+    class_type: isaaclab.envs.mdp.actions.joint_actions:JointPositionAction
+    asset_name: robot
+    debug_vis: false
+    clip: null
+    joint_names:
+    - joint1
+    - joint2
+    - joint3
+    - joint4
+    - joint5
+    - joint6
+    scale: 0.8
+    offset: 0.0
+    preserve_order: false
+    use_default_offset: true
+  gripper_action:
+    class_type: isaaclab.envs.mdp.actions.binary_joint_actions:BinaryJointPositionAction
+    asset_name: robot
+    debug_vis: false
+    clip: null
+    joint_names:
+    - joint7
+    - joint8
+    open_command_expr:
+      joint7: -0.06
+      joint8: 0.06
+    close_command_expr:
+      joint7: 0.0
+      joint8: 0.0
+  left_arm_hold:
+    class_type: isaaclab.envs.mdp.actions.joint_actions:JointPositionAction
+    asset_name: robot
+    debug_vis: false
+    clip: null
+    joint_names:
+    - joint1L
+    - joint2l
+    - joint3l
+    - joint4l
+    - joint5l
+    - joint6l
+    - joint7l
+    - joint8l
+    scale: 0.0
+    offset: 0.0
+    preserve_order: false
+    use_default_offset: true
+events:
+  robot_physics_material:
+    func: isaaclab.envs.mdp.events:randomize_rigid_body_material
+    params:
+      asset_cfg:
+        name: robot
+        joint_names: null
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: .*
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+      static_friction_range: !!python/tuple
+      - 0.8
+      - 1.25
+      dynamic_friction_range: !!python/tuple
+      - 0.8
+      - 1.25
+      restitution_range: !!python/tuple
+      - 0.0
+      - 0.0
+      num_buckets: 16
+    mode: startup
+    interval_range_s: null
+    is_global_time: false
+    min_step_count_between_reset: 0
+  cabinet_physics_material:
+    func: isaaclab.envs.mdp.events:randomize_rigid_body_material
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names: null
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: drawer_handle_top
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+      static_friction_range: !!python/tuple
+      - 2.0
+      - 2.5
+      dynamic_friction_range: !!python/tuple
+      - 2.0
+      - 2.25
+      restitution_range: !!python/tuple
+      - 0.0
+      - 0.0
+      num_buckets: 16
+    mode: startup
+    interval_range_s: null
+    is_global_time: false
+    min_step_count_between_reset: 0
+  reset_all:
+    func: isaaclab.envs.mdp.events:reset_scene_to_default
+    params: {}
+    mode: reset
+    interval_range_s: null
+    is_global_time: false
+    min_step_count_between_reset: 0
+  reset_robot_joints:
+    func: isaaclab.envs.mdp.events:reset_joints_by_offset
+    params:
+      asset_cfg:
+        name: robot
+        joint_names:
+        - joint1
+        - joint2
+        - joint3
+        - joint4
+        - joint5
+        - joint6
+        - joint7
+        - joint8
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+      position_range: !!python/tuple
+      - -0.1
+      - 0.1
+      velocity_range: !!python/tuple
+      - 0.0
+      - 0.0
+    mode: reset
+    interval_range_s: null
+    is_global_time: false
+    min_step_count_between_reset: 0
+rerender_on_reset: false
+wait_for_textures: true
+is_finite_horizon: false
+episode_length_s: 8.0
+rewards:
+  approach_ee_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:approach_ee_handle
+    params:
+      threshold: 0.2
+    weight: 2.0
+  align_ee_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:align_ee_handle
+    params: {}
+    weight: 0.5
+  approach_gripper_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:approach_gripper_handle
+    params:
+      offset: 0.04
+    weight: 5.0
+  align_grasp_around_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:align_grasp_around_handle
+    params: {}
+    weight: 0.125
+  straddle_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:straddle_handle
+    params:
+      threshold: 0.05
+    weight: 25.0
+  grasp_handle:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:grasp_handle
+    params:
+      threshold: 0.03
+      open_joint_pos: 0.06
+      asset_cfg:
+        name: robot
+        joint_names:
+        - joint7
+        - joint8
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+    weight: 100.0
+  open_drawer_bonus:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:open_drawer_bonus
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names:
+        - drawer_top_joint
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+    weight: 200.0
+  multi_stage_open_drawer:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:multi_stage_open_drawer
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names:
+        - drawer_top_joint
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+    weight: 50.0
+  action_rate_l2:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:conditional_action_rate_l2
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names:
+        - drawer_top_joint
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+    weight: -0.05
+  joint_vel:
+    func: HumanoidRLPackage.HumanoidRLSetup.tasks.bimanual_cabinet.mdp.rewards:conditional_joint_vel_l2
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names:
+        - drawer_top_joint
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+      robot_cfg:
+        name: robot
+        joint_names:
+        - joint1
+        - joint2
+        - joint3
+        - joint4
+        - joint5
+        - joint6
+        - joint7
+        - joint8
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+    weight: -0.01
+terminations:
+  time_out:
+    func: isaaclab.envs.mdp.terminations:time_out
+    params: {}
+    time_out: true
+  success:
+    func: isaaclab.envs.mdp.terminations:joint_pos_out_of_manual_limit
+    params:
+      asset_cfg:
+        name: cabinet
+        joint_names:
+        - drawer_top_joint
+        joint_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        fixed_tendon_names: null
+        fixed_tendon_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        body_names: null
+        body_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        object_collection_names: null
+        object_collection_ids: !!python/object/apply:builtins.slice
+        - null
+        - null
+        - null
+        preserve_order: false
+      bounds: !!python/tuple
+      - 0.0
+      - 0.39
+    time_out: false
+curriculum: {}
+commands: null
diff --git a/trained_models/bimanual_cabinet/successful_drawer_pull.pt b/trained_models/bimanual_cabinet/successful_drawer_pull.pt
new file mode 100644
index 00000000..45b47131
Binary files /dev/null and b/trained_models/bimanual_cabinet/successful_drawer_pull.pt differ