…ator and streamline debug handling
…rity and parameter naming
…auto-download functionality
…e-tuned and stock modes
- Deleted `graph_utils.py`, which contained functions for adjacency matrix creation and normalization. - Removed `lifter3d.py`, which included keypoint processing, 3D triangulation, and visualization functions. - Eliminated `mocap_dataset.py`, which defined the `MocapDataset` class for handling motion capture data.
… root path accordingly
…uperAnimalEstimator
…uperAnimalEstimator
… and reuse across images, improving efficiency and clarity.
…h/CUDA installation notes
deruyter92 left a comment
Great PR which definitely improves the package. I really like the addition of the fine-tuned SuperAnimal 2D model!
A few remarks:
- small bug in partial cleanup for rat7m
- the lazy downloading from Hugging Face is not working as I think you intended it
- the `predict()` method should be cleaned up a bit
- it would be great if you added tests for the new auto-download branch
Overall good PR! See comments
| def build_2d_estimator(): | ||
| """Build the 2D pose estimator once. Snapshot resolves lazily on first predict. | ||
|
|
||
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | ||
| Non-empty path -> use as a local override. | ||
| """ | ||
| from fmpose3d.common.config import SuperAnimalConfig | ||
| from fmpose3d.inference_api.fmpose3d import SuperAnimalEstimator | ||
| from fmpose3d.utils.weights import resolve_weights_path | ||
Well done refactoring this: way cleaner, and also more efficient! A few comments:
- The docstring seems to contain an error: the statement "snapshot resolves lazily on first predict" is not correct, since it is resolved immediately.
- `resolve_weights_path` seems to download from HF directly when given an empty path, which is inconsistent with the approach elsewhere (letting the `predict()` method trigger it).
- Minor nitpick: I think the imports in this case can stay at the top of the file. I would lazily import only heavy packages (like deeplabcut) or modules that are super specific to a single function. These are all lightweight central helpers, so they probably belong at the top of the file instead.
| def build_2d_estimator(): | |
| """Build the 2D pose estimator once. Snapshot resolves lazily on first predict. | |
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | |
| Non-empty path -> use as a local override. | |
| """ | |
| from fmpose3d.common.config import SuperAnimalConfig | |
| from fmpose3d.inference_api.fmpose3d import SuperAnimalEstimator | |
| from fmpose3d.utils.weights import resolve_weights_path | |
| def build_2d_estimator(): | |
| """Build the 2D pose estimator once. | |
| Empty --saved_2d_model_path -> auto-download fine-tuned snapshot from HF. | |
| Non-empty path -> use as a local override. | |
| """ | |
| print(f" - Left hind leg: {graph_rat.left_hind}") | ||
| print(f" - Right hind leg: {graph_rat.right_hind}") | ||
| print(f" - Spine: {graph_rat.spine}") | ||
| print(f" Distance to center (joint 4): {graph_rat.dist_center}") |
I think this one was forgotten in the removal of the Rat7M code.
| print(f" Distance to center (joint 4): {graph_rat.dist_center}") |
| pose_snapshot_path = cfg.pose_snapshot_path | ||
| if not pose_snapshot_path and cfg.auto_download_finetuned: | ||
| from fmpose3d.utils.weights import resolve_weights_path | ||
| pose_snapshot_path = resolve_weights_path("", "sa_finetune_hrnet_w32.pt") |
When auto-download is True and the path is not provided, `resolve_weights_path` is called on every `predict()` call (i.e. `hf_hub_download` checks the local cache on every call).
I think this could add up for videos with many frames. Instead, this should be resolved once (on the first `predict()` call)! E.g. you could define an attribute in `__init__` that holds the downloaded weights path after the first download, or use a simple flag.
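A minimal sketch of the "resolve once" idea, not the actual `SuperAnimalEstimator` code: the class `CachedSnapshotResolver` and the injected `resolver` callable are hypothetical names used only for illustration (in the PR, the resolver role would be played by `resolve_weights_path`). A fake resolver stands in for the Hugging Face download so the example runs offline.

```python
from typing import Callable, Optional


class CachedSnapshotResolver:
    """Hypothetical helper: resolve the snapshot path at most once."""

    def __init__(
        self,
        configured_path: str,
        auto_download: bool,
        resolver: Callable[[str, str], str],
        filename: str = "sa_finetune_hrnet_w32.pt",
    ) -> None:
        self._configured_path = configured_path
        self._auto_download = auto_download
        self._resolver = resolver
        self._filename = filename
        self._resolved: Optional[str] = None  # filled on first get()

    def get(self) -> str:
        # Only the first call can reach the resolver; later calls reuse the cache.
        if self._resolved is None:
            path = self._configured_path
            if not path and self._auto_download:
                path = self._resolver("", self._filename)
            self._resolved = path
        return self._resolved


# Fake resolver (assumption): records calls instead of touching the network.
calls = []


def fake_resolver(local_path: str, filename: str) -> str:
    calls.append(filename)
    return f"/tmp/cache/{filename}"


snapshot = CachedSnapshotResolver("", True, fake_resolver)
for _ in range(1000):  # e.g. one lookup per video frame
    snapshot.get()
```

With this shape, a non-empty configured path is returned as a local override without ever calling the resolver, and values derived from the path (such as the fine-tuned flag) can be computed from the same cached value.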
| # Fine-tuned mode: non-empty resolved path swaps the stock 39-joint head | ||
| # for a custom DLC checkpoint that predicts the 26-joint Animal3D layout | ||
| # natively (no _map_keypoints needed). | ||
| is_finetuned = bool(pose_snapshot_path) |
Same here: this can be resolved in `__init__`. Right now, all the information is derived from a static config, which is available at initialization time.
| def resolve_weights_path(model_weights_path: str, model_type: str) -> str: | ||
| def resolve_weights_path(local_path: str, filename: str) -> str: |
I think it's fine right now (since probably nobody is using this function yet), but we should be careful with renaming keyword arguments, as that can break people's scripts.
I.e. this is not backward compatible for people who handle the weights in their own scripts:
from fmpose3d.utils import resolve_weights_path
configured_path = ""
my_weights_path = resolve_weights_path(model_weights_path=configured_path) # <- breaks now!
or more concerning:
from fmpose3d.utils import resolve_weights_path
my_weights_path = resolve_weights_path(model_type="fmpose3d_humans") # <- breaks now!
TL;DR: I think it's fine for now, as you updated all the call sites internally, but be aware that people might use these public functions in their own scripts as well. We should try to keep all public functions backward compatible whenever possible.
In case this happens in the future, we could add a deprecation warning for cases that are more impactful than this minor change.
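A rough sketch of what such a deprecation shim could look like. The function name `resolve_weights_path_compat` is hypothetical, and it returns the resolved arguments instead of downloading anything. Whether an old `model_type` value maps one-to-one onto a new `filename` is an assumption here; a real shim might need a translation table.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from an empty string


def resolve_weights_path_compat(
    local_path=_UNSET,
    filename=_UNSET,
    *,
    model_weights_path=_UNSET,  # old keyword name, kept for compatibility
    model_type=_UNSET,          # old keyword name, kept for compatibility
):
    """Hypothetical shim: accept the old keyword names, warn, and forward."""
    if model_weights_path is not _UNSET:
        if local_path is not _UNSET:
            raise TypeError("got both 'local_path' and 'model_weights_path'")
        warnings.warn(
            "'model_weights_path' is deprecated; use 'local_path' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        local_path = model_weights_path
    if model_type is not _UNSET:
        if filename is not _UNSET:
            raise TypeError("got both 'filename' and 'model_type'")
        warnings.warn(
            "'model_type' is deprecated; use 'filename' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        filename = model_type  # assumption: old values are still valid here
    if filename is _UNSET:
        raise TypeError("missing required argument: 'filename'")
    if local_path is _UNSET:
        local_path = ""
    # A real implementation would resolve or download here; the sketch
    # returns the normalized arguments so it can run offline.
    return local_path, filename
```

Old call sites keep working and emit a `DeprecationWarning`, while new call sites using `local_path` and `filename` are unaffected.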
| # Default to fine-tuned + lazy HF auto-download so the animal API | ||
| # works out-of-the-box. Construction stays cheap (no network); | ||
| # the download fires on the first predict() call. | ||
| return ( | ||
| SuperAnimalEstimator(SuperAnimalConfig(auto_download_finetuned=True)), | ||
| AnimalPostProcessor(), | ||
| ) | ||
| return HRNetEstimator(), HumanPostProcessor() |
This seems to be inconsistent with how vis_animals.py resolves the path.
- Here, it is allowed to be handled lazily in the `predict()` method.
- In `build_2d_estimator()`, the weights are downloaded directly and passed as `pose_snapshot_path`.
See my other comments in vis_animals.py. I think you intended the lazy handling in both, and I agree that it is probably better!
| """ | ||
| FMPose3D: monocular 3D Pose Estimation via Flow Matching | ||
|
|
||
| Official implementation of the paper: | ||
| "FMPose3D: monocular 3D Pose Estimation via Flow Matching" | ||
| by Ti Wang, Xiaohang Yu, and Mackenzie Weygandt Mathis | ||
| Licensed under Apache 2.0 | ||
| """ | ||
|
|
||
| """Bundled DLC ``pytorch_config.yaml`` files for the animal 2D detector. | ||
|
|
||
| These yamls describe FMPose3D's fine-tuned SuperAnimal-Quadruped variants | ||
| and are loaded by :class:`fmpose3d.inference_api.SuperAnimalEstimator` when | ||
| the user does not supply an explicit ``pytorch_config_path``. They are | ||
| shipped as package data (see ``pyproject.toml`` ``[tool.setuptools.package-data]``). | ||
| """ |
| """ | |
| FMPose3D: monocular 3D Pose Estimation via Flow Matching | |
| Official implementation of the paper: | |
| "FMPose3D: monocular 3D Pose Estimation via Flow Matching" | |
| by Ti Wang, Xiaohang Yu, and Mackenzie Weygandt Mathis | |
| Licensed under Apache 2.0 | |
| """ | |
| """Bundled DLC ``pytorch_config.yaml`` files for the animal 2D detector. | |
| These yamls describe FMPose3D's fine-tuned SuperAnimal-Quadruped variants | |
| and are loaded by :class:`fmpose3d.inference_api.SuperAnimalEstimator` when | |
| the user does not supply an explicit ``pytorch_config_path``. They are | |
| shipped as package data (see ``pyproject.toml`` ``[tool.setuptools.package-data]``). | |
| """ | |
| """ | |
| FMPose3D: monocular 3D Pose Estimation via Flow Matching | |
| Official implementation of the paper: | |
| "FMPose3D: monocular 3D Pose Estimation via Flow Matching" | |
| by Ti Wang, Xiaohang Yu, and Mackenzie Weygandt Mathis | |
| Licensed under Apache 2.0 | |
| Bundled DLC ``pytorch_config.yaml`` files for the animal 2D detector. | |
| These yamls describe FMPose3D's fine-tuned SuperAnimal-Quadruped variants | |
| and are loaded by :class:`fmpose3d.inference_api.SuperAnimalEstimator` when | |
| the user does not supply an explicit ``pytorch_config_path``. They are | |
| shipped as package data (see ``pyproject.toml`` ``[tool.setuptools.package-data]``). | |
| """ |
Actually, I'm realizing that it would probably have been better to include the copyright header as a comment (i.e. using #) instead of as a docstring, since the whole thing now appears when running help(), instead of only the module docstring.
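To illustrate the difference, here is a small sketch using the standard-library `ast` module to extract what would become the module's `__doc__` (the text `help()` displays). The two source strings are abbreviated stand-ins for the real file, not its exact contents.

```python
import ast

# Variant A: copyright header written as a string literal (abbreviated).
HEADER_AS_DOCSTRING = '''"""
FMPose3D: monocular 3D Pose Estimation via Flow Matching
Licensed under Apache 2.0
"""

"""Bundled DLC config files for the animal 2D detector."""
'''

# Variant B: copyright header written as # comments (abbreviated).
HEADER_AS_COMMENT = '''# FMPose3D: monocular 3D Pose Estimation via Flow Matching
# Licensed under Apache 2.0

"""Bundled DLC config files for the animal 2D detector."""
'''


def module_docstring(source: str) -> str:
    """Return the text that would become the module's __doc__."""
    return ast.get_docstring(ast.parse(source))


doc_a = module_docstring(HEADER_AS_DOCSTRING)
doc_b = module_docstring(HEADER_AS_COMMENT)
print(repr(doc_a))
print(repr(doc_b))
```

In variant A the first string literal (the license header) is taken as the module docstring and the descriptive string after it is ignored; in variant B the comments are skipped and only the descriptive text is used.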
| patch( | ||
| "deeplabcut.pose_estimation_pytorch.apis.superanimal_analyze_images", | ||
| ) as mock_fn: | ||
| mock_fn.return_value = {"frame.png": {"bodyparts": fake_bp}} |
Is this working correctly? The code writes frames like "frame_000000.png", right?
Summary
This PR adds first-class support for the fine-tuned SuperAnimal-Quadruped 2D pose model used by the animal pipeline, enabling direct 26-joint Animal3D keypoint prediction and automatic checkpoint download from Hugging Face. It also improves the out-of-the-box demo/install path, fixes CPU fallback for the human HRNet demo, and removes unused legacy code/assets.
Changes
- `sa_finetune_hrnet_w32.pt` for 2D animal pose
- `fmpose3d_animals.pth` for the 3D lifter
- … `animals/demo/vis_animals.py` to build the 2D estimator and 3D lifter once, then reuse them across images.
- … `SuperAnimalConfig` options for fine-tuned checkpoints, detector overrides, and lazy Hugging Face resolution.
- … `map_location` and moving inputs to the model device.
- … `torch>=2.4.1,<2.5` and `torchvision>=0.19.1,<0.20`, and document the PyTorch/CUDA behavior in the README.
- … `>=3.10,<3.13`; README recommends Python 3.10 because install/demo paths were tested there.
- … `mot` to the codespell ignore list.
Validation
Ran install, test, and demo checks locally:
python3 -m pip install -e '.[animals,viz]' --dry-run
python3 -m pytest tests/test_demo_human.py tests/fmpose3d_api/test_fmpose3d.py -q
python3 -m pytest tests/test_model.py tests/test_training_pipeline.py -q
bash demo/vis_in_the_wild.sh
bash animals/demo/vis_animals.sh
Results
- `78 passed` for human demo/API tests
- `8 passed` for model/training smoke tests