diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md index dcd14c59..69b1026d 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/TRAINING_LOG.md @@ -5,7 +5,6 @@ ## Task 20-DOF Wato hand, in-hand cube reorientation (Isaac-Repose-Cube-WatoHand-v0). -Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm. --- @@ -29,33 +28,27 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm. --- -### 3. Stagnation termination fix (WORKS — kept) -**Problem:** `orientation_error_threshold=0.12` in stagnation but success threshold=0.4 — good episodes ended prematurely. -**Fix:** Stagnation threshold raised to 0.5 rad, `stagnant_steps` 90 → 150. - ---- - -### 4. PPO config tuning (WORKS — kept) +### 3. PPO config tuning (WORKS — kept) - `entropy_coef: 0.002 → 0.0001` — stopped rewarding randomness. - `num_steps_per_env: 24 → 48` — better return estimates. - `success_bonus weight: 250 → 50` — reduced VF loss spikes. --- -### 5. Angular velocity toward goal reward (WEAK — kept at low weight) +### 4. Angular velocity toward goal reward (WEAK — kept at low weight) **Idea:** Reward angular velocity component aligned with goal direction. **Problem:** First attempt gave negative reward (random spin anti-aligned on average) — suppressed all rotation. Clamped to 0, re-enabled at weight=0.2 once holding stabilized. **Outcome:** Stays flat at ~0.016 regardless of policy quality. Goal resampling on success resets the angular velocity alignment, pinning the average near zero. Kept as a weak directional signal only. --- -### 6. EMA alpha reduction — 0.95 → 0.8 (WORKS) +### 5. EMA alpha reduction — 0.95 → 0.8 (WORKS) **alpha=0.5:** Policy didn't use extra bandwidth. Reverted. **alpha=0.8:** Bandwidth ~1 Hz (vs ~0.25 Hz at 0.95). `action_rate_l2` rose -0.13 → -0.30. **Key unlock for rotation.** Orientation error broke below 1.5 consistently. --- -### 7. Z-axis curriculum (WORKS — converged) +### 6. Z-axis curriculum (WORKS — converged) **Problem:** Full 3D goals too hard to explore. Policy never discovered rotation despite holding. **Fix:** `rotation_axes = ["z"]` — goals restricted to palm-normal spin only. **Result (5000 iters, 245M steps, 1024 envs):** @@ -69,7 +62,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm. --- -### 8. Instanceable USD + scaling to 1024 envs (WORKS) +### 7. Instanceable USD + scaling to 1024 envs (WORKS) `replicate_physics = True` was already set. The existing `hand_urdf.usd` has no companion `_meshes.usd` but runs at 1024 envs without OOM — effectively sufficient. **To make fully instanceable (not yet done):** Isaac Sim GUI → URDF Importer → "Create Instanceable Asset" → produces `hand_urdf.usd` + `instanceable_meshes.usd`. Update `_HAND_USD_PATH` in `wato_hand_cfg.py`. @@ -81,7 +74,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm. --- -### 9. Reward rebalancing to unblock rotation (WORKS) +### 8. Reward rebalancing to unblock rotation (WORKS) **Problem:** Policy held well (~43%) but didn't rotate — `object_held_bonus` dominated. **Changes:** - `track_orientation_inv_l2`: weight 5.0 → **10.0** @@ -92,7 +85,7 @@ Palm-up orientation, cube spawned at `(-0.01, 0.09, 0.5)` in the palm. --- -### 10. Full 3D expansion attempt (FAILED — geometric limitation) +### 9. Full 3D expansion attempt (FAILED — geometric limitation) **Setup:** Resumed from z-axis checkpoint (iter 5000) with `rotation_axes = ["x", "y", "z"]`. **Observed:** orientation_error jumped 1.0 → 2.18. `action_rate_l2` dropped -0.30 → -0.08. After 500+ iters, no improvement. @@ -105,7 +98,7 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving --- -### 11. Smoothness fixes — action_rate ×5, joint_vel ×4, alpha 0.85 (WORKS) +### 10. Smoothness fixes — action_rate ×5, joint_vel ×4, alpha 0.85 (WORKS) **Problem:** After goal match, fingers still jerked. Jerking resets `consecutive_success` counter, pinning `max_consecutive_success` at 0. **Changes:** - `action_rate_l2` weight: -0.01 → **-0.05** (5×) @@ -116,8 +109,8 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving --- -### 12. Scale to 2048 envs (MARGINAL IMPROVEMENT — stagnation) -**Throughput:** 1024 envs ~36k steps/s → 2048 envs ~53k steps/s (1.5× not 2× — GPU near saturation). +### 11. Scale to 2048 envs (MARGINAL IMPROVEMENT — stagnation) +**Throughput:** 1024 envs ~36k steps/s → 2048 envs ~53k steps/s **VF loss:** Improved from 150-250 range down to 82-157 — bigger batch gives better return estimates. **Orientation error:** Oscillates 0.94-1.19, best mean batch 0.946. No sustained improvement beyond the ~1.0 floor. @@ -157,10 +150,7 @@ Shadow/Allegro hands mount the thumb on the *opposite side* of the palm, giving 1. **Break the stagnation floor (~1.0 rad)** — options: - Much stronger smoothness: `action_rate_l2` weight -0.05 → -0.15 to finally kill jerking. Risk: may suppress rotation bandwidth. - - Fresh training run with all current hyperparams — policy may have converged to a poor attractor that a new random init escapes. 2. **Full 3D reorientation** requires hardware change: reposition thumb to opposite side of palm (true opposition), or mount hand palm-sideways so z-axis becomes a tilt direction. More compute will not overcome the geometric limitation. -3. **Fully instanceable USD** — create proper `instanceable_meshes.usd` via Isaac Sim GUI for cleaner multi-env physics sharing. - -4. **Scale compute** — 4096+ envs (multi-GPU) is the next throughput step if hardware is available. +3. **Scale compute** — 4096+ envs (multi-GPU) is the next throughput step if hardware is available. diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py index 2f2a2649..71e86878 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/__init__.py @@ -7,7 +7,7 @@ entry_point="isaaclab.envs:ManagerBasedRLEnv", disable_env_checker=True, kwargs={ - "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeEnvCfg", + "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeEnvCfg", "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubePPORunnerCfg", }, ) @@ -17,7 +17,7 @@ entry_point="isaaclab.envs:ManagerBasedRLEnv", disable_env_checker=True, kwargs={ - "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeEnvCfg_PLAY", + "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeEnvCfg_PLAY", "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubePPORunnerCfg", }, ) @@ -27,7 +27,7 @@ entry_point="isaaclab.envs:ManagerBasedRLEnv", disable_env_checker=True, kwargs={ - "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeNoVelObsEnvCfg", + "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeNoVelObsEnvCfg", "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubeNoVelObsPPORunnerCfg", }, ) @@ -37,7 +37,7 @@ entry_point="isaaclab.envs:ManagerBasedRLEnv", disable_env_checker=True, kwargs={ - "env_cfg_entry_point": f"{__name__}.wato_env_cfg:WatoHandCubeNoVelObsEnvCfg_PLAY", + "env_cfg_entry_point": f"{__name__}.wato_hand_env_cfg:WatoHandCubeNoVelObsEnvCfg_PLAY", "rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:WatoHandCubeNoVelObsPPORunnerCfg", }, ) diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py similarity index 78% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py index 98ae49c7..55c508ad 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_env_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/config/wato_hand/wato_hand_env_cfg.py @@ -3,11 +3,7 @@ import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.inhand_env_cfg as inhand_env_cfg import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.mdp as inhand_mdp -from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import ( - INHAND_WATO_HAND_CFG, - INHAND_CUBE_POS, - INHAND_SPREAD_RAD, -) +from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import INHAND_SPREAD_RAD @configclass @@ -17,22 +13,15 @@ class WatoHandCubeEnvCfg(inhand_env_cfg.InHandObjectEnvCfg): def __post_init__(self): super().__post_init__() - # Share physics across envs — lower RAM per env, scales to more parallel rollouts. self.scene.replicate_physics = True self.scene.num_envs = 2048 - self.scene.robot = INHAND_WATO_HAND_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot") # Push expanded MCP_A limits (±27 deg) into PhysX, overriding the ±8.6 deg baked in the USD. self.events.expand_abduction_limits = EventTerm( func=inhand_mdp.apply_wato_hand_joint_limits, mode="startup", ) - # change the cube on plam's property - self.scene.object.spawn.scale = (0.8, 0.8, 0.8) - self.scene.object.init_state.pos = INHAND_CUBE_POS - self.scene.object.init_state.rot = (1.0, 0.0, 0.0, 0.0) - _grasp_scale = [0.2, 0.2] _splay_scale = [1.0, 1.0] self.events.reset_robot_joints.params["position_range"] = { @@ -62,7 +51,7 @@ def __post_init__(self): # Full 3D random orientation is too hard to explore from scratch; z-axis # rotation (spinning in the palm plane) is the most natural motion for this hand. # Once orientation_error shows a downward trend, expand back to ["x", "y"]. - self.commands.object_pose.rotation_axes = ["z"] + self.commands.cube_pose.rotation_axes = ["z"] # Small bonus for MCP_A velocity + spread deflection. self.rewards.spread_activity = RewTerm( @@ -81,12 +70,13 @@ def __post_init__(self): super().__post_init__() self.scene.num_envs = 16 self.observations.policy.enable_corruption = False - # Keep time_out so play episodes reset instead of clamping forever. + # Keep time_out so play episodes reset faster to see the next set self.episode_length_s = 10.0 # Nudge goal marker aside so it does not cover the physical cube. - self.commands.object_pose.marker_pos_offset = (-0.10, 0.0, 0.12) + self.commands.cube_pose.marker_pos_offset = (-0.10, 0.0, 0.12) +# Version with no velocity observation as input because in a real deployment, measuring velocity is noisy / unavailable @configclass class WatoHandCubeNoVelObsEnvCfg(WatoHandCubeEnvCfg): def __post_init__(self): @@ -100,7 +90,5 @@ def __post_init__(self): super().__post_init__() self.scene.num_envs = 16 self.observations.policy.enable_corruption = False - # Keep time_out so play episodes reset instead of clamping forever. self.episode_length_s = 10.0 - # Nudge goal marker aside so it does not cover the physical cube. - self.commands.object_pose.marker_pos_offset = (-0.10, 0.0, 0.12) + self.commands.cube_pose.marker_pos_offset = (-0.10, 0.0, 0.12) diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md new file mode 100644 index 00000000..f6518e01 --- /dev/null +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand.md @@ -0,0 +1,88 @@ +# In-Hand: Wato hand cube reorientation + +In-hand dexterous manipulation for the 20-DOF Wato hand (`hand_urdf.usd`) in Isaac Lab. The policy reorients a DexCube held in the palm toward a commanded goal orientation. **Play** shows a ghost goal cube marker (enabled via `WatoHandCubeEnvCfg_PLAY`). + +Task setup and MDP code are adapted from [Isaac Lab](https://github.com/isaac-sim/IsaacLab) in-hand manipulation examples. Experimentation history: `TRAINING_LOG.md`. + +**Environments** + +| Task ID | Scene | Mode | +| :--- | :--- | :--- | +| `Isaac-Repose-Cube-WatoHand-v0` | Palm + DexCube | Train | +| `Isaac-Repose-Cube-WatoHand-Play-v0` | Palm + DexCube | Play | +| `Isaac-Repose-Cube-WatoHand-NoVelObs-v0` | Palm + DexCube | Train (no velocity obs) | +| `Isaac-Repose-Cube-WatoHand-NoVelObs-Play-v0` | Palm + DexCube | Play (no velocity obs) | + +## Train & play + +Run from `HumanoidRL/` (the directory that contains `HumanoidRLPackage/`): + +```bash +# Train (default 2048 envs; try 1024 or 512 if OOM) +PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/train.py \ + --task=Isaac-Repose-Cube-WatoHand-v0 --headless + +# Play — omit --headless to see goal-orientation marker +PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/play.py \ + --task=Isaac-Repose-Cube-WatoHand-Play-v0 --num_envs=1 + +# Play — specific checkpoint +PYTHONPATH=$(pwd) /home/hy/IsaacLab/isaaclab.sh -p HumanoidRLPackage/rsl_rl_scripts/play.py \ + --task=Isaac-Repose-Cube-WatoHand-Play-v0 --num_envs=1 \ + --checkpoint logs/rsl_rl/wato_hand_cube//model_.pt +``` + +Checkpoints: `logs/rsl_rl/wato_hand_cube/`. PPO defaults: `max_iterations=5000`, `experiment_name=wato_hand_cube` (`config/wato_hand/agents/rsl_rl_ppo_cfg.py`). + +## Scene & command + +| Item | Value | +| :--- | :--- | +| Robot | `INHAND_WATO_HAND_CFG` (`modelCfg/wato_hand.py`) — palm-up, 20 DOF | +| Cube | Isaac Nucleus DexCube (instanceable), scale `(0.8, 0.8, 0.8)`, spawn `INHAND_CUBE_POS` | +| Goal command | `InHandReOrientationCommand` — resampled **on success** (not on a timer) | +| Goal position | Default cube spawn + `init_pos_offset = (0, 0, -0.04)` m (hold-in-palm target) | +| Goal orientation | Sampled on allowed `rotation_axes`; success when error **< 0.4 rad** | +| Curriculum (current) | `rotation_axes = ["z"]` — palm-normal spin only (`wato_hand_env_cfg.py`) | +| Startup event | `expand_abduction_limits` — overrides USD-baked MCP_A limits to ±27° | + +`replicate_physics=True` on Wato hand (instanceable-friendly). + +## Reward + +All base reward terms are defined in `RewardsCfg` (`inhand_env_cfg.py`). Implementations live in `mdp/rewards.py` (task-specific) and Isaac Lab `mdp` (generic penalties). `WatoHandCubeEnvCfg` adds one extra term — `spread_activity` — in `config/wato_hand/wato_hand_env_cfg.py`. + +| Category | Reward Function | Weight | Description | +| :--- | :--- | :--- | :--- | +| **Task** | Position tracking (`track_pos_l2`) | −3.0 | L2 distance to goal position (encourages holding cube in palm). | +| | Orientation tracking (`track_orientation_inv_l2`) | 10.0 | $1 / (\text{orientation\_error} + 0.1)$ — dense rotation signal. | +| | Success bonus (`success_bonus`) | 50.0 | Binary reward when orientation error **< 0.4 rad**. | +| | Object held bonus (`object_held_bonus`) | 0.5 | +1/step when cube within **0.10 m** of goal position. | +| | Angular velocity toward goal (`object_ang_vel_toward_goal`) | 0.2 | Positive component of cube spin aligned with goal (clamped ≥ 0). | +| | Spread activity (`mcp_a_spread_activity`) | 0.03 | Encourages MCP_A abduction velocity + deflection (Wato only). | +| **Penalties** | Object away (`object_away_penalty`) | −5.0 | Terminal penalty when `object_out_of_reach` fires. | +| | Action rate L2 | −0.05 | Penalises jerky finger commands. | +| | Joint velocity L2 | $-1.0 \times 10^{-4}$ | Penalises high joint speeds. | +| | Action L2 | $-1.0 \times 10^{-4}$ | Penalises large action magnitudes. | + +## Terminations + +| Termination | Condition | +| :--- | :--- | +| Time out | Episode length exceeds **20 s** (10 s in Play). | +| Max consecutive success | Goal reached **50 times** in a row within one episode. | +| Object out of reach | Cube drifts **> 0.3 m** from robot root. | +| Orientation stagnation | Orientation error stays **> 0.5 rad** for **150** consecutive steps. | + +## Sim settings + +| Setting | Value | +| :--- | :--- | +| `decimation` | 4 | +| `sim.dt` | 1/120 s (~120 Hz) | +| `episode_length_s` | 20.0 | +| Default `num_envs` | 2048 | +| Action smoothing | EMA joint-position targets, `alpha = 0.85` | +| PPO `num_steps_per_env` | 48 | +| PPO `gamma` | 0.998 | +| PPO `entropy_coef` | 0.0001 | diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py index 35beedf8..cca88f0c 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/inhand_env_cfg.py @@ -1,7 +1,5 @@ from __future__ import annotations -from dataclasses import MISSING - import isaaclab.sim as sim_utils from isaaclab.assets import ArticulationCfg, AssetBaseCfg, RigidObjectCfg from isaaclab.envs import ManagerBasedRLEnvCfg @@ -19,17 +17,19 @@ from isaaclab.utils.noise import AdditiveGaussianNoiseCfg as Gnoise import HumanoidRLPackage.HumanoidRLSetup.tasks.inhand.mdp as mdp +from HumanoidRLPackage.HumanoidRLSetup.modelCfg.wato_hand import INHAND_CUBE_POS, INHAND_WATO_HAND_CFG @configclass class InHandObjectSceneCfg(InteractiveSceneCfg): - robot: ArticulationCfg = MISSING + robot: ArticulationCfg = INHAND_WATO_HAND_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot") - object: RigidObjectCfg = RigidObjectCfg( - prim_path="{ENV_REGEX_NS}/object", + cube: RigidObjectCfg = RigidObjectCfg( + prim_path="{ENV_REGEX_NS}/cube", spawn=sim_utils.UsdFileCfg( usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Blocks/DexCube/dex_cube_instanceable.usd", + scale=(0.8, 0.8, 0.8), rigid_props=sim_utils.RigidBodyPropertiesCfg( kinematic_enabled=False, disable_gravity=False, @@ -42,7 +42,7 @@ class InHandObjectSceneCfg(InteractiveSceneCfg): ), mass_props=sim_utils.MassPropertiesCfg(density=400.0), ), - init_state=RigidObjectCfg.InitialStateCfg(pos=(0.0, -0.19, 0.56), rot=(1.0, 0.0, 0.0, 0.0)), + init_state=RigidObjectCfg.InitialStateCfg(pos=INHAND_CUBE_POS, rot=(1.0, 0.0, 0.0, 0.0)), ) light = AssetBaseCfg( @@ -59,7 +59,7 @@ class InHandObjectSceneCfg(InteractiveSceneCfg): @configclass class CommandsCfg: - object_pose = mdp.InHandReOrientationCommandCfg() + cube_pose = mdp.InHandReOrientationCommandCfg() @configclass @@ -88,26 +88,26 @@ class KinematicObsGroupCfg(ObsGroup): # -- object terms object_pos = ObsTerm( - func=mdp.root_pos_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("object")} + func=mdp.root_pos_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("cube")} ) object_quat = ObsTerm( - func=mdp.root_quat_w, params={"asset_cfg": SceneEntityCfg("object"), "make_quat_unique": False} + func=mdp.root_quat_w, params={"asset_cfg": SceneEntityCfg("cube"), "make_quat_unique": False} ) object_lin_vel = ObsTerm( - func=mdp.root_lin_vel_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("object")} + func=mdp.root_lin_vel_w, noise=Gnoise(std=0.002), params={"asset_cfg": SceneEntityCfg("cube")} ) object_ang_vel = ObsTerm( func=mdp.root_ang_vel_w, scale=0.2, noise=Gnoise(std=0.002), - params={"asset_cfg": SceneEntityCfg("object")}, + params={"asset_cfg": SceneEntityCfg("cube")}, ) # -- command terms - goal_pose = ObsTerm(func=mdp.generated_commands, params={"command_name": "object_pose"}) + goal_pose = ObsTerm(func=mdp.generated_commands, params={"command_name": "cube_pose"}) goal_quat_diff = ObsTerm( func=mdp.goal_quat_diff, - params={"asset_cfg": SceneEntityCfg("object"), "command_name": "object_pose", "make_quat_unique": False}, + params={"asset_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose", "make_quat_unique": False}, ) # -- action terms @@ -177,7 +177,7 @@ class EventCfg: func=mdp.randomize_rigid_body_material, mode="startup", params={ - "asset_cfg": SceneEntityCfg("object"), + "asset_cfg": SceneEntityCfg("cube"), "static_friction_range": (0.7, 1.3), "dynamic_friction_range": (0.7, 1.3), "restitution_range": (0.0, 0.0), @@ -188,7 +188,7 @@ class EventCfg: func=mdp.randomize_rigid_body_mass, mode="startup", params={ - "asset_cfg": SceneEntityCfg("object"), + "asset_cfg": SceneEntityCfg("cube"), "mass_distribution_params": (0.4, 1.6), "operation": "scale", }, @@ -201,7 +201,7 @@ class EventCfg: params={ "pose_range": {"x": [-0.01, 0.01], "y": [-0.01, 0.01], "z": [-0.01, 0.01]}, "velocity_range": {}, - "asset_cfg": SceneEntityCfg("object"), + "asset_cfg": SceneEntityCfg("cube"), }, ) reset_robot_joints = EventTerm( @@ -221,17 +221,17 @@ class RewardsCfg: track_pos_l2 = RewTerm( func=mdp.track_pos_l2, weight=-3.0, - params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"}, + params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"}, ) track_orientation_inv_l2 = RewTerm( func=mdp.track_orientation_inv_l2, weight=10.0, - params={"object_cfg": SceneEntityCfg("object"), "rot_eps": 0.1, "command_name": "object_pose"}, + params={"object_cfg": SceneEntityCfg("cube"), "rot_eps": 0.1, "command_name": "cube_pose"}, ) success_bonus = RewTerm( func=mdp.success_bonus, weight=50.0, - params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"}, + params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"}, ) # -- penalties @@ -245,13 +245,13 @@ class RewardsCfg: object_held_bonus = RewTerm( func=mdp.object_held_bonus, weight=0.5, - params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose", "hold_threshold": 0.10}, + params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose", "hold_threshold": 0.10}, ) object_ang_vel_toward_goal = RewTerm( func=mdp.object_ang_vel_toward_goal, weight=0.2, - params={"object_cfg": SceneEntityCfg("object"), "command_name": "object_pose"}, + params={"object_cfg": SceneEntityCfg("cube"), "command_name": "cube_pose"}, ) # Penalty for dropping the object — critical for early training. @@ -267,7 +267,7 @@ class TerminationsCfg: time_out = DoneTerm(func=mdp.time_out, time_out=True) max_consecutive_success = DoneTerm( - func=mdp.max_consecutive_success, params={"num_success": 50, "command_name": "object_pose"} + func=mdp.max_consecutive_success, params={"num_success": 50, "command_name": "cube_pose"} ) object_out_of_reach = DoneTerm(func=mdp.object_away_from_robot, params={"threshold": 0.3}) @@ -278,7 +278,7 @@ class TerminationsCfg: orientation_stagnation = DoneTerm( func=mdp.orientation_stagnation, params={ - "command_name": "object_pose", + "command_name": "cube_pose", "orientation_error_threshold": 0.5, "stagnant_steps": 150, }, diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py index 6ba856d4..6804792e 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/commands_cfg.py @@ -9,15 +9,10 @@ @configclass class InHandReOrientationCommandCfg(CommandTermCfg): - """Configuration for the uniform 3D orientation command term. - - Please refer to the :class:`InHandReOrientationCommand` class for more details. - """ - class_type: type = InHandReOrientationCommand resampling_time_range: tuple[float, float] = (1e6, 1e6) # no resampling based on time - asset_name: str = "object" + asset_name: str = "cube" """Scene asset that receives orientation goals (the manipuland rigid body).""" init_pos_offset: tuple[float, float, float] = (0.0, 0.0, -0.04) @@ -44,6 +39,7 @@ class InHandReOrientationCommandCfg(CommandTermCfg): debug_vis: bool = True + """Goal-orientation debug marker (DexCube visual only).""" goal_pose_visualizer_cfg: VisualizationMarkersCfg = VisualizationMarkersCfg( prim_path="/Visuals/Command/goal_marker", markers={ @@ -53,4 +49,4 @@ class InHandReOrientationCommandCfg(CommandTermCfg): ), }, ) - """Goal-orientation debug marker (DexCube visual only).""" \ No newline at end of file + \ No newline at end of file diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py index 56f2a56d..53a78f29 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/commands/orientation_command.py @@ -24,27 +24,13 @@ class InHandReOrientationCommand(CommandTerm): The constant position commands is to encourage that the object does not move during the task. For instance, the object should not fall off the robot's palm. - - Unlike typical command terms, where the goals are resampled based on time, this command term - does not resample the goals based on time. Instead, the goals are resampled when the object - reaches the goal orientation. The goal orientation is considered to be reached when the - orientation error is below a certain threshold. """ cfg: InHandReOrientationCommandCfg - """Configuration for the command term.""" def __init__(self, cfg: InHandReOrientationCommandCfg, env: ManagerBasedRLEnv): - """Initialize the command term class. - - Args: - cfg: The configuration parameters for the command term. - env: The environment object. - """ - # initialize the base class super().__init__(cfg, env) - # object self.object: RigidObject = env.scene[cfg.asset_name] # create buffers to store the command @@ -80,10 +66,6 @@ def command(self) -> torch.Tensor: """The desired goal pose in the environment frame. Shape is (num_envs, 7).""" return torch.cat((self.pos_command_e, self.quat_command_w), dim=-1) - """ - Implementation specific functions. - """ - def _update_metrics(self): # logs data # -- compute the orientation error diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py index 6b330db4..b9534bda 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/rewards.py @@ -13,7 +13,7 @@ def success_bonus( - env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("object") + env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("cube") ) -> torch.Tensor: """Bonus reward for successfully reaching the goal. @@ -23,7 +23,7 @@ def success_bonus( Args: env: The environment object. command_name: The command term to be used for extracting the goal. - object_cfg: The configuration for the scene entity. Default is "object". + object_cfg: The configuration for the scene entity. Default is "cube". """ # extract useful elements asset: RigidObject = env.scene[object_cfg.name] @@ -40,7 +40,7 @@ def success_bonus( def track_pos_l2( - env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("object") + env: ManagerBasedRLEnv, command_name: str, object_cfg: SceneEntityCfg = SceneEntityCfg("cube") ) -> torch.Tensor: """Reward for tracking the object position using the L2 norm. @@ -49,7 +49,7 @@ def track_pos_l2( Args: env: The environment object. command_term: The command term to be used for extracting the goal. - object_cfg: The configuration for the scene entity. Default is "object". + object_cfg: The configuration for the scene entity. Default is "cube". """ # extract useful elements asset: RigidObject = env.scene[object_cfg.name] @@ -66,7 +66,7 @@ def track_pos_l2( def track_orientation_inv_l2( env: ManagerBasedRLEnv, command_name: str, - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), rot_eps: float = 1e-3, ) -> torch.Tensor: """Reward for tracking the object orientation using the inverse of the orientation error. @@ -76,7 +76,7 @@ def track_orientation_inv_l2( Args: env: The environment object. command_name: The command term to be used for extracting the goal. - object_cfg: The configuration for the scene entity. Default is "object". + object_cfg: The configuration for the scene entity. Default is "cube". rot_eps: The threshold for the orientation error. Default is 1e-3. """ asset: RigidObject = env.scene[object_cfg.name] @@ -92,7 +92,7 @@ def track_orientation_inv_l2( def object_ang_vel_toward_goal( env: ManagerBasedRLEnv, command_name: str, - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), ) -> torch.Tensor: """Reward the component of object angular velocity that reduces orientation error. @@ -121,7 +121,7 @@ def object_ang_vel_toward_goal( def object_held_bonus( env: ManagerBasedRLEnv, command_name: str, - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), hold_threshold: float = 0.10, ) -> torch.Tensor: """Per-step bonus (1.0) when object is within hold_threshold of the goal position. diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py index a09fe9f4..3a8a2a15 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/inhand/mdp/terminations.py @@ -31,7 +31,7 @@ def object_away_from_goal( env: ManagerBasedRLEnv, threshold: float, command_name: str, - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), ) -> torch.Tensor: """Check if object has gone far from the goal. @@ -42,7 +42,7 @@ def object_away_from_goal( env: The environment object. threshold: The threshold for the distance between the robot and the object. command_name: The command term to be used for extracting the goal. - object_cfg: The configuration for the scene entity. Default is "object". + object_cfg: The configuration for the scene entity. Default is "cube". """ # extract useful elements command_term: InHandReOrientationCommand = env.command_manager.get_term(command_name) @@ -59,7 +59,7 @@ def object_away_from_robot( env: ManagerBasedRLEnv, threshold: float, asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"), - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), ) -> torch.Tensor: """Check if object has gone far from the robot. @@ -70,7 +70,7 @@ def object_away_from_robot( env: The environment object. threshold: The threshold for the distance between the robot and the object. asset_cfg: The configuration for the robot entity. Default is "robot". - object_cfg: The configuration for the object entity. Default is "object". + object_cfg: The configuration for the object entity. Default is "cube". """ # extract useful elements robot = env.scene[asset_cfg.name] @@ -89,7 +89,7 @@ def orientation_stagnation( stagnant_steps: int, object_ang_vel_threshold: float | None = None, joint_vel_threshold: float | None = None, - object_cfg: SceneEntityCfg = SceneEntityCfg("object"), + object_cfg: SceneEntityCfg = SceneEntityCfg("cube"), robot_cfg: SceneEntityCfg = SceneEntityCfg("robot"), ) -> torch.Tensor: """Terminate when orientation error stays above threshold for too many steps. diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/__init__.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/__init__.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/__init__.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/__init__.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/__init__.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/__init__.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/__init__.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/__init__.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/__init__.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/agents/rsl_rl_ppo_cfg.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py similarity index 90% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py index f587e9bc..c38c3e61 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/HumanoidRLEnv/joint_pos_env_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/HumanoidRLEnv/joint_pos_env_cfg.py @@ -1,5 +1,5 @@ from isaaclab.utils import configclass -from HumanoidRLPackage.HumanoidRLSetup.tasks.manipulation.reach_env_cfg import ReachEnvCfg +from HumanoidRLPackage.HumanoidRLSetup.tasks.reach.reach_env_cfg import ReachEnvCfg @configclass diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/__init__.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/config/__init__.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/config/__init__.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/__init__.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/__init__.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/__init__.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/__init__.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/rewards.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/rewards.py similarity index 100% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/mdp/rewards.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/mdp/rewards.py diff --git a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py similarity index 98% rename from autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py rename to autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py index 664cd88b..b9f9774d 100644 --- a/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/manipulation/reach_env_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/HumanoidRL/HumanoidRLPackage/HumanoidRLSetup/tasks/reach/reach_env_cfg.py @@ -12,7 +12,7 @@ from isaaclab.utils import configclass from isaaclab.utils.noise import AdditiveUniformNoiseCfg as Unoise -import HumanoidRLPackage.HumanoidRLSetup.tasks.manipulation.mdp as mdp +import HumanoidRLPackage.HumanoidRLSetup.tasks.reach.mdp as mdp from HumanoidRLPackage.HumanoidRLSetup.modelCfg.humanoid_arm_hand import ARM_CFG diff --git a/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py b/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py index a8a16ecd..b46b9b9c 100644 --- a/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py +++ b/autonomy/simulation/Humanoid_Wato/wato_hand/wato_hand_cfg.py @@ -1,14 +1,4 @@ -""" -20-DOF Wato hand articulation config. - -Robot model: Humanoid_Wato/wato_hand (hand_urdf.usd / hand_urdf.urdf) -Joint limits: Isaac Sim Physics Inspector (/World/hand_urdf/hand_origin) -Finger splay (MCP_A_1..4): ±8.594 deg (±0.15 rad) per joint. - - import sys, os - sys.path.insert(0, os.path.join(, "autonomy/simulation/Humanoid_Wato/wato_hand")) - from wato_hand_cfg import WATO_HAND_CFG, JOINT_POS_LIMITS, apply_joint_limits -""" +"""20-DOF Wato hand articulation config.""" import math import os @@ -25,8 +15,8 @@ def _deg(degrees: float) -> float: return degrees * math.pi / 180.0 -# --- Joint limits from Physics Inspector (degrees -> rad) -------------------- -_MCP_A_LIMIT = (_deg(-27.0), _deg(27.0)) # expanded to AllegroHand range; self-collision is off +# --- Joint limits (degrees -> rad) -------------------- +_MCP_A_LIMIT = (_deg(-27.0), _deg(27.0)) # expanded to wider range for now; self-collision is off _MCP_FLEX_LIMIT = (0.0, _deg(85.944)) _PIP_FLEX_LIMIT = (_deg(-85.944), 0.0) _DIP_FLEX_LIMIT = (0.0, _deg(85.944)) @@ -54,7 +44,7 @@ def _deg(degrees: float) -> float: "MCP_2": _MCP_FLEX_LIMIT, "PIP_2": _PIP_FLEX_LIMIT, "DIP_2": _DIP_FLEX_LIMIT, - # Ring (finger 3) — slightly different flex range + # Ring (finger 3) "MCP_A_3": _MCP_A_LIMIT, "MCP_3": _MCP_FLEX_LIMIT_RING, "PIP_3": _PIP_FLEX_LIMIT_RING, @@ -66,7 +56,6 @@ def _deg(degrees: float) -> float: "DIP_4": _DIP_FLEX_LIMIT_RING, } -# Ordered like config/joint_names_hand_urdf.yaml (excluding the empty root slot). ALL_JOINT_NAMES = [ "circumduction", "MCP_A_thumb", @@ -90,7 +79,6 @@ def _deg(degrees: float) -> float: "DIP_4", ] -# All 20 revolute DOF (matches config/joint_names_hand_urdf.yaml). ACTUATED_JOINT_NAMES = list(ALL_JOINT_NAMES) _DEFAULT_JOINT_POS = {name: 0.0 for name in ALL_JOINT_NAMES} diff --git a/docker/simulation/isaac_il/QUICKSTART.md b/docker/simulation/isaac_il/QUICKSTART.md index 577bf5bb..b28e2aaf 100644 --- a/docker/simulation/isaac_il/QUICKSTART.md +++ b/docker/simulation/isaac_il/QUICKSTART.md @@ -1,4 +1,4 @@ -# SO101 sim IL — copy-paste commands +# SO101 sim IL Exact workflow used for **HF dataset → ACT train → sim eval** (no physical arm). Stack: Isaac Lab 2.3.2 / Sim 5.1 / LeRobot 0.4.3 / ACT / watod `simulation_il`.