diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md
index dcd14c59..69b1026d 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md
@@ -5,7 +5,6 @@
 
 ## Task
 20-DOF Wato hand, in-hand cube reorientation (Isaac-Repose-Cube-WatoHand-v0).
-Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm.
 
 ---
 
@@ -29,33 +28,27 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm.
 
 ---
 
-### 3. Stagnation termination fix (WORKS — kept)
-**Problem:** `orientation_error_threshold=0.12` in stagnation but success threshold=0.4 — good episodes ended prematurely.
-**Fix:** Stagnation threshold raised to 0.5 rad, `stagnant_steps` 90 → 150.
-
----
-
-### 4. PPO config tuning (WORKS — kept)
+### 3. PPO config tuning (WORKS — kept)
 - `entropy_coef: 0.002 → 0.0001` — stopped rewarding randomness.
 - `num_steps_per_env: 24 → 48` — better return estimates.
 - `success_bonus weight: 250 → 50` — reduced VF loss spikes.
 
 ---
 
-### 5. Angular velocity toward goal reward (WEAK — kept at low weight)
+### 4. Angular velocity toward goal reward (WEAK — kept at low weight)
 **Idea:** Reward angular velocity component aligned with goal direction.
 **Problem:** First attempt gave negative reward (random spin anti-aligned on average) — suppressed all rotation. Clamped to 0, re-enabled at weight=0.2 once holding stabilized.
 **Outcome:** Stays flat at ~0.016 regardless of policy quality. Goal resampling on success resets the angular velocity alignment, pinning the average near zero. Kept as a weak directional signal only.
 
 ---
 
-### 6. EMA alpha reduction — 0.95 → 0.8 (WORKS)
+### 5. EMA alpha reduction — 0.95 → 0.8 (WORKS)
 **alpha=0.5:** Policy didn't use extra bandwidth. Reverted.
 **alpha=0.8:** Bandwidth ~1 Hz (vs ~0.25 Hz at 0.95). `action_rate_l2` rose -0.13 → -0.30. **Key unlock for rotation.** Orientation error broke below 1.5 consistently.
 
 ---
 
-### 7. Z-axis curriculum (WORKS — converged)
+### 6. Z-axis curriculum (WORKS — converged)
 **Problem:** Full 3D goals too hard to explore. Policy never discovered rotation despite holding.
 **Fix:** `rotation_axes = ["z"]` — goals restricted to palm-normal spin only.
 **Result (5000 iters, 245M steps, 1024 envs):**
@@ -69,7 +62,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm.
 
 ---
 
-### 8. Instanceable USD + scaling to 1024 envs (WORKS)
+### 7. Instanceable USD + scaling to 1024 envs (WORKS)
 `replicate_physics = True` was already set. The existing `hand_urdf.usd` has no companion `_meshes.usd` but runs at 1024 envs without OOM — effectively sufficient.
 
 **To make fully instanceable (not yet done):** Isaac Sim GUI → URDF Importer → "Create Instanceable Asset" → produces `hand_urdf.usd` + `instanceable_meshes.usd`. Update `_HAND_USD_PATH` in `wato_hand_cfg.py`.
@@ -81,7 +74,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm.
 
 ---
 
-### 9. Reward rebalancing to unblock rotation (WORKS)
+### 8. Reward rebalancing to unblock rotation (WORKS)
 **Problem:** Policy held well (~43%) but didn't rotate — `object_held_bonus` dominated.
 **Changes:**
 - `track_orientation_inv_l2`: weight 5.0 → **10.0**
@@ -92,7 +85,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm.
 
 ---
 
-### 10. Full 3D expansion attempt (FAILED — geometric limitation)
+### 9. Full 3D expansion attempt (FAILED — geometric limitation)
 **Setup:** Resumed from z-axis checkpoint (iter 5000) with `rotation_axes = ["x", "y", "z"]`.
 **Observed:** orientation_error jumped 1.0 → 2.18. `action_rate_l2` dropped -0.30 → -0.08. After 500+ iters, no improvement.
 
@@ -105,7 +98,7 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving
 
 ---
 
-### 11. Smoothness fixes — action_rate ×5, joint_vel ×4, alpha 0.85 (WORKS)
+### 10. Smoothness fixes — action_rate ×5, joint_vel ×4, alpha 0.85 (WORKS)
 **Problem:** After goal match, fingers still jerked. Jerking resets `consecutive_success` counter, pinning `max_consecutive_success` at 0.
 **Changes:**
 - `action_rate_l2` weight: -0.01 → **-0.05** (5×)
@@ -116,8 +109,8 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving
 
 ---
 
-### 12. Scale to 2048 envs (MARGINAL IMPROVEMENT — stagnation)
-**Throughput:** 1024 envs ~36k steps/s → 2048 envs ~53k steps/s (1.5× not 2× — GPU near saturation).
+### 11. Scale to 2048 envs (MARGINAL IMPROVEMENT — stagnation)
+**Throughput:** 1024 envs ~36k steps/s → 2048 envs ~53k steps/s
 **VF loss:** Improved from 150-250 range down to 82-157 — bigger batch gives better return estimates.
 **Orientation error:** Oscillates 0.94-1.19, best mean batch 0.946. No sustained improvement beyond the ~1.0 floor.
 
@@ -157,10 +150,7 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving
 
 1. **Break the stagnation floor (~1.0 rad)** — options:
    - Much stronger smoothness: `action_rate_l2` weight -0.05 → -0.15 to finally kill jerking. Risk: may suppress rotation bandwidth.
-   - Fresh training run with all current hyperparams — policy may have converged to a poor attractor that a new random init escapes.
 
 2. **Full 3D reorientation** requires hardware change: reposition thumb to opposite side of palm (true opposition), or mount hand palm-sideways so z-axis becomes a tilt direction. More compute will not overcome the geometric limitation.
 
-3. **Fully instanceable USD** — create proper `instanceable_meshes.usd` via Isaac Sim GUI for cleaner multi-env physics sharing.
-
-4. **Scale compute** — 4096+ envs (multi-GPU) is the next throughput step if hardware is available.
+3. **Scale compute** — 4096+ envs (multi-GPU) is the next throughput step if hardware is available.
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py
index 2f2a2649..71e86878 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py
@@ -7,7 +7,7 @@
     entry_point="isaaclab.envs:ManagerBasedRLEnv",
     disable_env_checker=True,
     kwargs={
-        "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeEnvCfg",
+        "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeEnvCfg",
         "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubePPORunnerCfg",
     },
 )
@@ -17,7 +17,7 @@
     entry_point="isaaclab.envs:ManagerBasedRLEnv",
     disable_env_checker=True,
     kwargs={
-        "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeEnvCfg_PLAY",
+        "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeEnvCfg_PLAY",
         "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubePPORunnerCfg",
     },
 )
@@ -27,7 +27,7 @@
     entry_point="isaaclab.envs:ManagerBasedRLEnv",
     disable_env_checker=True,
     kwargs={
-        "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeNoVelObsEnvCfg",
+        "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeNoVelObsEnvCfg",
         "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubeNoVelObsPPORunnerCfg",
     },
 )
@@ -37,7 +37,7 @@
     entry_point="isaaclab.envs:ManagerBasedRLEnv",
     disable_env_checker=True,
     kwargs={
-        "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeNoVelObsEnvCfg_PLAY",
+        "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeNoVelObsEnvCfg_PLAY",
         "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubeNoVelObsPPORunnerCfg",
     },
 )
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py
similarity index 78%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py
index 98ae49c7..55c508ad 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py
@@ -3,11 +3,7 @@
 
 import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.inhand_env_cfg as inhand_env_cfg
 import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.mdp as inhand_mdp
-from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import (
-    INHAND_WATO_HAND_CFG,
-    INHAND_CUBE_POS,
-    INHAND_SPREAD_RAD,
-)
+from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import INHAND_SPREAD_RAD
 
 
 @configclass
@@ -17,22 +13,15 @@ class WatoHandCubeEnvCfg(inhand_env_cfg.InHandObjectEnvCfg):
     def __post_init__(self):
         super().__post_init__()
 
-        # Share physics across envs — lower RAM per env, scales to more parallel rollouts.
         self.scene.replicate_physics = True
         self.scene.num_envs = 2048
 
-        self.scene.robot = INHAND_WATO_HAND_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
         # Push expanded MCP_A limits (±27 deg) into PhysX, overriding the ±8.6 deg baked in the USD.
         self.events.expand_abduction_limits = EventTerm(
             func=inhand_mdp.apply_wato_hand_joint_limits,
             mode="startup",
         )
 
-        # change the cube on plam's property
-        self.scene.object.spawn.scale = (0.8, 0.8, 0.8)
-        self.scene.object.init_state.pos = INHAND_CUBE_POS
-        self.scene.object.init_state.rot = (1.0, 0.0, 0.0, 0.0)
-
         _grasp_scale = [0.2, 0.2]
         _splay_scale = [1.0, 1.0]
         self.events.reset_robot_joints.params["position_range"] = {
@@ -62,7 +51,7 @@ def __post_init__(self):
         # Full 3D random orientation is too hard to explore from scratch; z-axis
         # rotation (spinning in the palm plane) is the most natural motion for this hand.
         # Once orientation_error shows a downward trend, expand back to ["x", "y"].
-        self.commands.object_pose.rotation_axes = ["z"]
+        self.commands.cube_pose.rotation_axes = ["z"]
 
         # Small bonus for MCP_A velocity + spread deflection.
         self.rewards.spread_activity = RewTerm(
@@ -81,12 +70,13 @@ def __post_init__(self):
         super().__post_init__()
         self.scene.num_envs = 16
         self.observations.policy.enable_corruption = False
-        # Keep time_out so play episodes reset instead of clamping forever.
+        # Keep time_out so play episodes reset faster to see the next set
         self.episode_length_s = 10.0
         # Nudge goal marker aside so it does not cover the physical cube.
-        self.commands.object_pose.marker_pos_offset = (-0.10, 0.0, 0.12)
+        self.commands.cube_pose.marker_pos_offset = (-0.10, 0.0, 0.12)
 
 
+# Version with no velocity observation as input because in a real deployment, measuring velocity is noisy / unavailable
 @configclass
 class WatoHandCubeNoVelObsEnvCfg(WatoHandCubeEnvCfg):
     def __post_init__(self):
@@ -100,7 +90,5 @@ def __post_init__(self):
         super().__post_init__()
         self.scene.num_envs = 16
         self.observations.policy.enable_corruption = False
-        # Keep time_out so play episodes reset instead of clamping forever.
         self.episode_length_s = 10.0
-        # Nudge goal marker aside so it does not cover the physical cube.
-        self.commands.object_pose.marker_pos_offset = (-0.10, 0.0, 0.12)
+        self.commands.cube_pose.marker_pos_offset = (-0.10, 0.0, 0.12)
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md
new file mode 100644
index 00000000..f6518e01
--- /dev/null
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md
@@ -0,0 +1,88 @@
+# In-Hand: Wato hand cube reorientation
+
+In-hand dexterous manipulation for the 20-DOF Wato hand (`hand_urdf.usd`) in Isaac Lab. The policy reorients a DexCube held in the palm toward a commanded goal orientation. **Play** shows a ghost goal cube marker (enabled via `WatoHandCubeEnvCfg_PLAY`).
+
+Task setup and MDP code are adapted from [Isaac Lab](https://github.com/isaac-sim/IsaacLab) in-hand manipulation examples. Experimentation history: `TRAINING_LOG.md`.
+
+**Environments**
+
+| Task ID | Scene | Mode |
+| :--- | :--- | :--- |
+| `Isaac-Repose-Cube-WatoHand-v0` | Palm + DexCube | Train |
+| `Isaac-Repose-Cube-WatoHand-Play-v0` | Palm + DexCube | Play |
+| `Isaac-Repose-Cube-WatoHand-NoVelObs-v0` | Palm + DexCube | Train (no velocity obs) |
+| `Isaac-Repose-Cube-WatoHand-NoVelObs-Play-v0` | Palm + DexCube | Play (no velocity obs) |
+
+## Train & play
+
+Run from `HumanoidRL/` (the directory that contains `HumanoidRLPackage/`):
+
+```bash
+# Train (default 2048 envs; try 1024 or 512 if OOM)
+PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/train.py \
+  --task=Isaac-Repose-Cube-WatoHand-v0 --headless
+
+# Play — omit --headless to see goal-orientation marker
+PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/play.py \
+  --task=Isaac-Repose-Cube-WatoHand-Play-v0 --num_envs=1
+
+# Play — specific checkpoint
+PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/play.py \
+  --task=Isaac-Repose-Cube-WatoHand-Play-v0 --num_envs=1 \
+  --checkpoint logs/rsl_rl/wato_hand_cube/<run>/model_<iter>.pt
+```
+
+Checkpoints: `logs/rsl_rl/wato_hand_cube/`. PPO defaults: `max_iterations=5000`, `experiment_name=wato_hand_cube` (`config/wato_hand/agents/rsl_rl_ppo_cfg.py`).
+
+## Scene & command
+
+| Item | Value |
+| :--- | :--- |
+| Robot | `INHAND_WATO_HAND_CFG` (`modelCfg/wato_hand.py`) — palm-up, 20 DOF |
+| Cube | Isaac Nucleus DexCube (instanceable), scale `(0.8, 0.8, 0.8)`, spawn `INHAND_CUBE_POS` |
+| Goal command | `InHandReOrientationCommand` — resampled **on success** (not on a timer) |
+| Goal position | Default cube spawn + `init_pos_offset = (0, 0, -0.04)` m (hold-in-palm target) |
+| Goal orientation | Sampled on allowed `rotation_axes`; success when error **< 0.4 rad** |
+| Curriculum (current) | `rotation_axes = ["z"]` — palm-normal spin only (`wato_hand_env_cfg.py`) |
+| Startup event | `expand_abduction_limits` — overrides USD-baked MCP_A limits to ±27° |
+
+`replicate_physics=True` on Wato hand (instanceable-friendly).
+
+## Reward
+
+All base reward terms are defined in `RewardsCfg` (`inhand_env_cfg.py`). Implementations live in `mdp/rewards.py` (task-specific) and Isaac Lab `mdp` (generic penalties). `WatoHandCubeEnvCfg` adds one extra term — `spread_activity` — in `config/wato_hand/wato_hand_env_cfg.py`.
+
+| Category | Reward Function | Weight | Description |
+| :--- | :--- | :--- | :--- |
+| **Task** | Position tracking (`track_pos_l2`) | −3.0 | L2 distance to goal position (encourages holding cube in palm). |
+| | Orientation tracking (`track_orientation_inv_l2`) | 10.0 | $1 / (\text{orientation\_error} + 0.1)$ — dense rotation signal. |
+| | Success bonus (`success_bonus`) | 50.0 | Binary reward when orientation error **< 0.4 rad**. |
+| | Object held bonus (`object_held_bonus`) | 0.5 | +1/step when cube within **0.10 m** of goal position. |
+| | Angular velocity toward goal (`object_ang_vel_toward_goal`) | 0.2 | Positive component of cube spin aligned with goal (clamped ≥ 0). |
+| | Spread activity (`mcp_a_spread_activity`) | 0.03 | Encourages MCP_A abduction velocity + deflection (Wato only). |
+| **Penalties** | Object away (`object_away_penalty`) | −5.0 | Terminal penalty when `object_out_of_reach` fires. |
+| | Action rate L2 | −0.05 | Penalises jerky finger commands. |
+| | Joint velocity L2 | $-1.0 \times 10^{-4}$ | Penalises high joint speeds. |
+| | Action L2 | $-1.0 \times 10^{-4}$ | Penalises large action magnitudes. |
+
+## Terminations
+
+| Termination | Condition |
+| :--- | :--- |
+| Time out | Episode length exceeds **20 s** (10 s in Play). |
+| Max consecutive success | Goal reached **50 times** in a row within one episode. |
+| Object out of reach | Cube drifts **> 0.3 m** from robot root. |
+| Orientation stagnation | Orientation error stays **> 0.5 rad** for **150** consecutive steps. |
+
+## Sim settings
+
+| Setting | Value |
+| :--- | :--- |
+| `decimation` | 4 |
+| `sim.dt` | 1/120 s (~120 Hz) |
+| `episode_length_s` | 20.0 |
+| Default `num_envs` | 2048 |
+| Action smoothing | EMA joint-position targets, `alpha = 0.85` |
+| PPO `num_steps_per_env` | 48 |
+| PPO `gamma` | 0.998 |
+| PPO `entropy_coef` | 0.0001 |
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py
index 35beedf8..cca88f0c 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py
@@ -1,7 +1,5 @@
 from __future__ import annotations
 
-from dataclasses import MISSING
-
 import isaaclab.sim as sim_utils
 from isaaclab.assets import ArticulationCfg, AssetBaseCfg, RigidObjectCfg
 from isaaclab.envs import ManagerBasedRLEnvCfg
@@ -19,17 +17,19 @@
 from isaaclab.utils.noise import AdditiveGaussianNoiseCfg as Gnoise
 
 import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.mdp as mdp
+from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import INHAND_CUBE_POS, INHAND_WATO_HAND_CFG
 
 
 
 @configclass
 class InHandObjectSceneCfg(InteractiveSceneCfg):
-    robot: ArticulationCfg = MISSING
+    robot: ArticulationCfg = INHAND_WATO_HAND_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
 
-    object: RigidObjectCfg = RigidObjectCfg(
-        prim_path="{ENV_REGEX_NS}/object",
+    cube: RigidObjectCfg = RigidObjectCfg(
+        prim_path="{ENV_REGEX_NS}/cube",
         spawn=sim_utils.UsdFileCfg(
             usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Blocks/DexCube/dex_cube_instanceable.usd",
+            scale=(0.8, 0.8, 0.8),
             rigid_props=sim_utils.RigidBodyPropertiesCfg(
                 kinematic_enabled=False,
                 disable_gravity=False,
@@ -42,7 +42,7 @@ class InHandObjectSceneCfg(InteractiveSceneCfg):
             ),
             mass_props=sim_utils.MassPropertiesCfg(density=400.0),
         ),
-        init_state=RigidObjectCfg.InitialStateCfg(pos=(0.0, -0.19, 0.56), rot=(1.0, 0.0, 0.0, 0.0)),
+        init_state=RigidObjectCfg.InitialStateCfg(pos=INHAND_CUBE_POS, rot=(1.0, 0.0, 0.0, 0.0)),
     )
 
     light = AssetBaseCfg(
@@ -59,7 +59,7 @@ class InHandObjectSceneCfg(InteractiveSceneCfg):
 
 @configclass
 class CommandsCfg:
-    object_pose = mdp.InHandReOrientationCommandCfg()
+    cube_pose = mdp.InHandReOrientationCommandCfg()
 
 
 @configclass
@@ -88,26 +88,26 @@ class KinematicObsGroupCfg(ObsGroup):
 
         # -- object terms
         object_pos = ObsTerm(
-            func=mdp.root_pos_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("object")}
+            func=mdp.root_pos_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("cube")}
         )
         object_quat = ObsTerm(
-            func=mdp.root_quat_w, params={"asset_cfg": SceneEntityCfg("object"), "make_quat_unique": False}
+            func=mdp.root_quat_w, params={"asset_cfg": SceneEntityCfg("cube"), "make_quat_unique": False}
         )
         object_lin_vel = ObsTerm(
-            func=mdp.root_lin_vel_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("object")}
+            func=mdp.root_lin_vel_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("cube")}
         )
         object_ang_vel = ObsTerm(
             func=mdp.root_ang_vel_w,
             scale=0.2,
             noise=Gnoise(std=0.002),
-            params={"asset_cfg": SceneEntityCfg("object")},
+            params={"asset_cfg": SceneEntityCfg("cube")},
         )
 
         # -- command terms
-        goal_pose = ObsTerm(func=mdp.generated_commands, params={"command_name": "object_pose"})
+        goal_pose = ObsTerm(func=mdp.generated_commands, params={"command_name": "cube_pose"})
         goal_quat_diff = ObsTerm(
             func=mdp.goal_quat_diff,
-            params={"asset_cfg": SceneEntityCfg("object"), "command_name": "object_pose", "make_quat_unique": False},
+            params={"asset_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose", "make_quat_unique": False},
         )
 
         # -- action terms
@@ -177,7 +177,7 @@ class EventCfg:
         func=mdp.randomize_rigid_body_material,
         mode="startup",
         params={
-            "asset_cfg": SceneEntityCfg("object"),
+            "asset_cfg": SceneEntityCfg("cube"),
             "static_friction_range": (0.7, 1.3),
             "dynamic_friction_range": (0.7, 1.3),
             "restitution_range": (0.0, 0.0),
@@ -188,7 +188,7 @@ class EventCfg:
         func=mdp.randomize_rigid_body_mass,
         mode="startup",
         params={
-            "asset_cfg": SceneEntityCfg("object"),
+            "asset_cfg": SceneEntityCfg("cube"),
             "mass_distribution_params": (0.4, 1.6),
             "operation": "scale",
         },
@@ -201,7 +201,7 @@ class EventCfg:
         params={
             "pose_range": {"x": [-0.01, 0.01], "y": [-0.01, 0.01], "z": [-0.01, 0.01]},
             "velocity_range": {},
-            "asset_cfg": SceneEntityCfg("object"),
+            "asset_cfg": SceneEntityCfg("cube"),
         },
     )
     reset_robot_joints = EventTerm(
@@ -221,17 +221,17 @@ class RewardsCfg:
     track_pos_l2 = RewTerm(
         func=mdp.track_pos_l2,
         weight=-3.0,
-        params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"},
+        params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"},
     )
     track_orientation_inv_l2 = RewTerm(
         func=mdp.track_orientation_inv_l2,
         weight=10.0,
-        params={"object_cfg": SceneEntityCfg("object"), "rot_eps": 0.1, "command_name": "object_pose"},
+        params={"object_cfg": SceneEntityCfg("cube"), "rot_eps": 0.1, "command_name": "cube_pose"},
     )
     success_bonus = RewTerm(
         func=mdp.success_bonus,
         weight=50.0,
-        params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"},
+        params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"},
     )
 
     # -- penalties
@@ -245,13 +245,13 @@ class RewardsCfg:
     object_held_bonus = RewTerm(
         func=mdp.object_held_bonus,
         weight=0.5,
-        params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose", "hold_threshold": 0.10},
+        params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose", "hold_threshold": 0.10},
     )
 
     object_ang_vel_toward_goal = RewTerm(
         func=mdp.object_ang_vel_toward_goal,
         weight=0.2,
-        params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"},
+        params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"},
     )
 
     # Penalty for dropping the object — critical for early training.
@@ -267,7 +267,7 @@ class TerminationsCfg:
     time_out = DoneTerm(func=mdp.time_out, time_out=True)
 
     max_consecutive_success = DoneTerm(
-        func=mdp.max_consecutive_success, params={"num_success": 50, "command_name": "object_pose"}
+        func=mdp.max_consecutive_success, params={"num_success": 50, "command_name": "cube_pose"}
     )
 
     object_out_of_reach = DoneTerm(func=mdp.object_away_from_robot, params={"threshold": 0.3})
@@ -278,7 +278,7 @@ class TerminationsCfg:
     orientation_stagnation = DoneTerm(
         func=mdp.orientation_stagnation,
         params={
-            "command_name": "object_pose",
+            "command_name": "cube_pose",
             "orientation_error_threshold": 0.5,
             "stagnant_steps": 150,
         },
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py
index 6ba856d4..6804792e 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py
@@ -9,15 +9,10 @@
 
 @configclass
 class InHandReOrientationCommandCfg(CommandTermCfg):
-    """Configuration for the uniform 3D orientation command term.
-
-    Please refer to the :class:`InHandReOrientationCommand` class for more details.
-    """
-
     class_type: type = InHandReOrientationCommand
     resampling_time_range: tuple[float, float] = (1e6, 1e6)  # no resampling based on time
 
-    asset_name: str = "object"
+    asset_name: str = "cube"
     """Scene asset that receives orientation goals (the manipuland rigid body)."""
 
     init_pos_offset: tuple[float, float, float] = (0.0, 0.0, -0.04)
@@ -44,6 +39,7 @@ class InHandReOrientationCommandCfg(CommandTermCfg):
 
     debug_vis: bool = True
 
+    """Goal-orientation debug marker (DexCube visual only)."""
     goal_pose_visualizer_cfg: VisualizationMarkersCfg = VisualizationMarkersCfg(
         prim_path="/Visuals/Command/goal_marker",
         markers={
@@ -53,4 +49,4 @@ class InHandReOrientationCommandCfg(CommandTermCfg):
             ),
         },
     )
-    """Goal-orientation debug marker (DexCube visual only)."""
\ No newline at end of file
+    
\ No newline at end of file
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py
index 56f2a56d..53a78f29 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py
@@ -24,27 +24,13 @@ class InHandReOrientationCommand(CommandTerm):
 
     The constant position commands is to encourage that the object does not move during the task.
     For instance, the object should not fall off the robot's palm.
-
-    Unlike typical command terms, where the goals are resampled based on time, this command term
-    does not resample the goals based on time. Instead, the goals are resampled when the object
-    reaches the goal orientation. The goal orientation is considered to be reached when the
-    orientation error is below a certain threshold.
     """
 
     cfg: InHandReOrientationCommandCfg
-    """Configuration for the command term."""
 
     def __init__(self, cfg: InHandReOrientationCommandCfg, env: ManagerBasedRLEnv):
-        """Initialize the command term class.
-
-        Args:
-            cfg: The configuration parameters for the command term.
-            env: The environment object.
-        """
-        # initialize the base class
         super().__init__(cfg, env)
 
-        # object
         self.object: RigidObject = env.scene[cfg.asset_name]
 
         # create buffers to store the command
@@ -80,10 +66,6 @@ def command(self) -> torch.Tensor:
         """The desired goal pose in the environment frame. Shape is (num_envs, 7)."""
         return torch.cat((self.pos_command_e, self.quat_command_w), dim=-1)
 
-    """
-    Implementation specific functions.
-    """
-
     def _update_metrics(self):
         # logs data
         # -- compute the orientation error
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py
index 6b330db4..b9534bda 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py
@@ -13,7 +13,7 @@
 
 
 def success_bonus(
-    env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("object")
+    env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("cube")
 ) -> torch.Tensor:
     """Bonus reward for successfully reaching the goal.
 
@@ -23,7 +23,7 @@ def success_bonus(
     Args:
         env: The environment object.
         command_name: The command term to be used for extracting the goal.
-        object_cfg: The configuration for the scene entity. Default is "object".
+        object_cfg: The configuration for the scene entity. Default is "cube".
     """
     # extract useful elements
     asset: RigidObject = env.scene[object_cfg.name]
@@ -40,7 +40,7 @@ def success_bonus(
 
 
 def track_pos_l2(
-    env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("object")
+    env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("cube")
 ) -> torch.Tensor:
     """Reward for tracking the object position using the L2 norm.
 
@@ -49,7 +49,7 @@ def track_pos_l2(
     Args:
         env: The environment object.
         command_term: The command term to be used for extracting the goal.
-        object_cfg: The configuration for the scene entity. Default is "object".
+        object_cfg: The configuration for the scene entity. Default is "cube".
     """
     # extract useful elements
     asset: RigidObject = env.scene[object_cfg.name]
@@ -66,7 +66,7 @@ def track_pos_l2(
 def track_orientation_inv_l2(
     env: ManagerBasedRLEnv,
     command_name: str,
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
     rot_eps: float = 1e-3,
 ) -> torch.Tensor:
     """Reward for tracking the object orientation using the inverse of the orientation error.
@@ -76,7 +76,7 @@ def track_orientation_inv_l2(
     Args:
         env: The environment object.
         command_name: The command term to be used for extracting the goal.
-        object_cfg: The configuration for the scene entity. Default is "object".
+        object_cfg: The configuration for the scene entity. Default is "cube".
         rot_eps: The threshold for the orientation error. Default is 1e-3.
     """
     asset: RigidObject = env.scene[object_cfg.name]
@@ -92,7 +92,7 @@ def track_orientation_inv_l2(
 def object_ang_vel_toward_goal(
     env: ManagerBasedRLEnv,
     command_name: str,
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
 ) -> torch.Tensor:
     """Reward the component of object angular velocity that reduces orientation error.
 
@@ -121,7 +121,7 @@ def object_ang_vel_toward_goal(
 def object_held_bonus(
     env: ManagerBasedRLEnv,
     command_name: str,
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
     hold_threshold: float = 0.10,
 ) -> torch.Tensor:
     """Per-step bonus (1.0) when object is within hold_threshold of the goal position.
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py
index a09fe9f4..3a8a2a15 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py
@@ -31,7 +31,7 @@ def object_away_from_goal(
     env: ManagerBasedRLEnv,
     threshold: float,
     command_name: str,
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
 ) -> torch.Tensor:
     """Check if object has gone far from the goal.
 
@@ -42,7 +42,7 @@ def object_away_from_goal(
         env: The environment object.
         threshold: The threshold for the distance between the robot and the object.
         command_name: The command term to be used for extracting the goal.
-        object_cfg: The configuration for the scene entity. Default is "object".
+        object_cfg: The configuration for the scene entity. Default is "cube".
     """
     # extract useful elements
     command_term: InHandReOrientationCommand = env.command_manager.get_term(command_name)
@@ -59,7 +59,7 @@ def object_away_from_robot(
     env: ManagerBasedRLEnv,
     threshold: float,
     asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
 ) -> torch.Tensor:
     """Check if object has gone far from the robot.
 
@@ -70,7 +70,7 @@ def object_away_from_robot(
         env: The environment object.
         threshold: The threshold for the distance between the robot and the object.
         asset_cfg: The configuration for the robot entity. Default is "robot".
-        object_cfg: The configuration for the object entity. Default is "object".
+        object_cfg: The configuration for the object entity. Default is "cube".
     """
     # extract useful elements
     robot = env.scene[asset_cfg.name]
@@ -89,7 +89,7 @@ def orientation_stagnation(
     stagnant_steps: int,
     object_ang_vel_threshold: float | None = None,
     joint_vel_threshold: float | None = None,
-    object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
+    object_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
     robot_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
 ) -> torch.Tensor:
     """Terminate when orientation error stays above threshold for too many steps.
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/__init__.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/__init__.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/__init__.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/__init__.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/__init__.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/__init__.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/__init__.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/__init__.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/__init__.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py
similarity index 90%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py
index f587e9bc..c38c3e61 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py
@@ -1,5 +1,5 @@
 from isaaclab.utils import configclass
-from HumanoidRLPackage.HumanoidRLSetup.tasks.manipulation.reach_env_cfg import ReachEnvCfg
+from HumanoidRLPackage.HumanoidRLSetup.tasks.reach.reach_env_cfg import ReachEnvCfg
 
 
 @configclass
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/__init__.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/__init__.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/__init__.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/__init__.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/__init__.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/__init__.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/rewards.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/rewards.py
similarity index 100%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/rewards.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/rewards.py
diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py
similarity index 98%
rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py
rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py
index 664cd88b..b9f9774d 100644
--- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py
@@ -12,7 +12,7 @@
 from isaaclab.utils import configclass
 from isaaclab.utils.noise import AdditiveUniformNoiseCfg as Unoise
 
-import HumanoidRLPackage.HumanoidRLSetup.tasks.manipulation.mdp as mdp
+import HumanoidRLPackage.HumanoidRLSetup.tasks.reach.mdp as mdp
 from HumanoidRLPackage.HumanoidRLSetup.modelCfg.humanoid_arm_hand import ARM_CFG
 
 
diff --git a/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py b/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py
index a8a16ecd..b46b9b9c 100644
--- a/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py
+++ b/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py
@@ -1,14 +1,4 @@
-"""
-20-DOF Wato hand articulation config.
-
-Robot model: Humanoid_Wato/wato_hand (hand_urdf.usd / hand_urdf.urdf)
-Joint limits: Isaac Sim Physics Inspector (/World/hand_urdf/hand_origin)
-Finger splay (MCP_A_1..4): ±8.594 deg (±0.15 rad) per joint.
-
-    import sys, os
-    sys.path.insert(0, os.path.join(<repo>, "autonomy/simulation/Humanoid_Wato/wato_hand"))
-    from wato_hand_cfg import WATO_HAND_CFG, JOINT_POS_LIMITS, apply_joint_limits
-"""
+"""20-DOF Wato hand articulation config."""
 import math
 import os
 
@@ -25,8 +15,8 @@ def _deg(degrees: float) -> float:
     return degrees * math.pi / 180.0
 
 
-# --- Joint limits from Physics Inspector (degrees -> rad) --------------------
-_MCP_A_LIMIT = (_deg(-27.0), _deg(27.0))  # expanded to AllegroHand range; self-collision is off
+# --- Joint limits (degrees -> rad) --------------------
+_MCP_A_LIMIT = (_deg(-27.0), _deg(27.0))  # expanded to wider range for now; self-collision is off
 _MCP_FLEX_LIMIT = (0.0, _deg(85.944))
 _PIP_FLEX_LIMIT = (_deg(-85.944), 0.0)
 _DIP_FLEX_LIMIT = (0.0, _deg(85.944))
@@ -54,7 +44,7 @@ def _deg(degrees: float) -> float:
     "MCP_2": _MCP_FLEX_LIMIT,
     "PIP_2": _PIP_FLEX_LIMIT,
     "DIP_2": _DIP_FLEX_LIMIT,
-    # Ring (finger 3) — slightly different flex range
+    # Ring (finger 3)
     "MCP_A_3": _MCP_A_LIMIT,
     "MCP_3": _MCP_FLEX_LIMIT_RING,
     "PIP_3": _PIP_FLEX_LIMIT_RING,
@@ -66,7 +56,6 @@ def _deg(degrees: float) -> float:
     "DIP_4": _DIP_FLEX_LIMIT_RING,
 }
 
-# Ordered like config/joint_names_hand_urdf.yaml (excluding the empty root slot).
 ALL_JOINT_NAMES = [
     "circumduction",
     "MCP_A_thumb",
@@ -90,7 +79,6 @@ def _deg(degrees: float) -> float:
     "DIP_4",
 ]
 
-# All 20 revolute DOF (matches config/joint_names_hand_urdf.yaml).
 ACTUATED_JOINT_NAMES = list(ALL_JOINT_NAMES)
 
 _DEFAULT_JOINT_POS = {name: 0.0 for name in ALL_JOINT_NAMES}
diff --git a/docker/simulation/isaac_il/QUICKSTART.md b/docker/simulation/isaac_il/QUICKSTART.md
index 577bf5bb..b28e2aaf 100644
--- a/docker/simulation/isaac_il/QUICKSTART.md
+++ b/docker/simulation/isaac_il/QUICKSTART.md
@@ -1,4 +1,4 @@
-# SO101 sim IL — copy-paste commands
+# SO101 sim IL
 
 Exact workflow used for **HF dataset → ACT train → sim eval** (no physical arm).
 Stack: Isaac Lab 2.3.2 / Sim 5.1 / LeRobot 0.4.3 / ACT / watod `simulation_il`.