v0.6.3rc1: hardware validation across macOS / Windows / Linux + GPUs #759

Description

@waltsims

Tracking pre-release validation of v0.6.3rc1 before promoting to stable v0.6.3. The binary install path was substantially rewritten: this is the first release on the unified binary pipeline (kspacefirstorder-unified v1.4.2). The goal is to validate that the URL flip, rename-on-download, 16-arch CUDA binary, and Windows multi-arch fix all behave correctly on real hardware.

Install

pip install --pre k-wave-python
# or pinned:
pip install k-wave-python==0.6.3rc1

Smoke test recipe (any platform)

# 1. Confirm install + binary download
python -c "import kwave; print(kwave.__version__, kwave.BINARY_VERSION)"
# expected: 0.6.3rc1 v1.4.2

# 2. Confirm binaries landed at the expected paths (rename-on-download verified)
python -c "import kwave, os; print(sorted(p for p in os.listdir(kwave.BINARY_PATH) if not p.endswith('.json')))"
# expected on linux/darwin: ['kspaceFirstOrder-CUDA', 'kspaceFirstOrder-OMP']  (no -linux / -darwin suffix)
# expected on windows:     ['kspaceFirstOrder-CUDA.exe', 'kspaceFirstOrder-OMP.exe', plus the 19 shared DLLs]

# 3. Run an OMP example end-to-end
uv run examples/ivp_homogeneous_medium.py   # or whatever your env runner is

# 4. If GPU available: run the same example with backend='cpp', device='gpu'
# (or use any example that exercises the CUDA path)
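
Step 2 can also be scripted so that a failure says what is wrong rather than just printing a listing. The following is a minimal sketch, not part of k-wave-python: `check_binaries`, `EXPECTED`, and `PLATFORM_SUFFIXES` are hypothetical names, and the expected file names are taken from the listings in step 2. On macOS the CUDA entry may not apply, since `URL_DICT['darwin']['cuda']` is `[]`, so the expected names are a parameter.

```python
import os

# Base names from the expected listings in step 2 (assumption: no other
# non-JSON, non-DLL files matter for this check).
EXPECTED = ("kspaceFirstOrder-CUDA", "kspaceFirstOrder-OMP")
# Suffixes that should have been removed by rename-on-download.
PLATFORM_SUFFIXES = ("-linux", "-darwin", "-windows")


def check_binaries(binary_dir, expected=EXPECTED, is_windows=False):
    """Return a list of problems found in binary_dir (empty list means OK)."""
    names = {n for n in os.listdir(binary_dir) if not n.endswith(".json")}
    ext = ".exe" if is_windows else ""
    problems = []
    # Every expected binary must be present under its final (renamed) name.
    for base in expected:
        if base + ext not in names:
            problems.append(f"missing {base + ext}")
    # No file may still carry a platform suffix.
    for name in sorted(names):
        stem = name[:-4] if name.endswith(".exe") else name
        if stem.endswith(PLATFORM_SUFFIXES):
            problems.append(f"not renamed: {name}")
    return problems
```

With the package installed, `check_binaries(kwave.BINARY_PATH)` should return an empty list on Linux; pass `expected=("kspaceFirstOrder-OMP",)` on macOS if only the OMP binary is shipped there.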

Validation matrix

macOS

  • Apple Silicon (arm64) + OMP — primary supported config; downloads kspaceFirstOrder-OMP-darwin and runs cleanly. Verified 2026-06-21 on M1 (macOS 15.1): 0.6.3rc1 installs from --pre, binary lands at the expected path, python and cpp backends both run and agree bit-for-bit (corr=1.0, 0 diff) on an on-grid sensor.
  • Apple Silicon (arm64) — Homebrew runtime deps present — the darwin OMP binary is not self-contained: otool -L shows it links fftw, hdf5, zlib, libomp at hardcoded /opt/homebrew/opt/... paths, so backend="cpp" fails at launch with dyld: Library not loaded … unless they're installed. Confirm a clean machine follows the documented brew install fftw hdf5 zlib libomp (docs index / docs/get_started/new_api.rst) and that the cpp backend then runs. (Consider bundling via @rpath/delocate so this isn't a manual prerequisite.)
  • Intel Mac (x86_64) + OMP — should emit the _darwin_unsupported RuntimeWarning at import time; no CUDA path (URL_DICT['darwin']['cuda'] is [])
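
For the Homebrew runtime-deps row, it can help to see which linked libraries are absent before launching the binary. The sketch below parses the text printed by `otool -L` and reports absolute-path dependencies that do not exist on disk. The helper names `parse_otool_libs` and `missing_libs` are hypothetical, and the parser assumes the usual `otool -L` layout (binary name on the first line, then one indented path per line followed by a parenthesised version note).

```python
import os
import re

# On recent macOS, system libraries live in the dyld shared cache and are
# not present as files, so they are skipped rather than reported missing.
SYSTEM_PREFIXES = ("/usr/lib/", "/System/")


def parse_otool_libs(otool_output):
    """Extract library paths from the text printed by `otool -L <binary>`."""
    libs = []
    for line in otool_output.splitlines()[1:]:  # first line names the binary
        match = re.match(r"\s+(\S.*?)\s+\(compatibility version", line)
        if match:
            libs.append(match.group(1))
    return libs


def missing_libs(libs, exists=os.path.exists):
    """Return absolute-path, non-system libraries that are not on disk."""
    return [
        p for p in libs
        if p.startswith("/")
        and not p.startswith(SYSTEM_PREFIXES)
        and not exists(p)
    ]
```

One way to obtain the input is `subprocess.run(["otool", "-L", path], capture_output=True, text=True).stdout`, where `path` points at the OMP binary under `kwave.BINARY_PATH`. Entries such as `@rpath/...` are ignored by `missing_libs` because they are not absolute paths.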

Linux

  • Linux + OMP (no GPU): kspaceFirstOrder-OMP downloads, example runs on CPU
  • Linux + CUDA on Turing (RTX 20xx / T4) — exercises sm_75 SASS section
  • Linux + CUDA on Ampere (A100 / RTX 30xx) — exercises sm_80 or sm_86
  • Linux + CUDA on Ada (RTX 40xx / L40) — exercises sm_89
  • Linux + CUDA on Hopper (H100 / H200) — exercises sm_90 / sm_90a
  • Linux + CUDA on Blackwell consumer (RTX 50xx / RTX PRO 6000) — exercises sm_120 / sm_120a (the new arch)
  • Linux + CUDA on Blackwell datacenter (B200 / GB200) — exercises sm_100 / sm_100a (the new arch)
  • Linux + CUDA on Volta (V100) — should emit the runtime cc<7.5 warning; the binary load is then expected to fail with "no kernel image is available for execution on the device" (the warning's role is to tell users why)
  • Linux + CUDA on Pascal (GTX 10xx / P100) — same: runtime warning + expected binary load failure
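
When reporting a Linux + CUDA row, the expected outcome follows from the card's compute capability. The sketch below encodes the rows above as a lookup table; `FAMILIES`, `parse_cc`, and `expectation` are hypothetical names, and the table covers only the capabilities this matrix mentions (P100 is 6.0, GTX 10xx is 6.1, V100 is 7.0).

```python
# Compute capability -> architecture family, limited to the rows listed above.
FAMILIES = {
    (6, 0): "Pascal", (6, 1): "Pascal",
    (7, 0): "Volta",
    (7, 5): "Turing",
    (8, 0): "Ampere", (8, 6): "Ampere",
    (8, 9): "Ada",
    (9, 0): "Hopper",
    (10, 0): "Blackwell datacenter",
    (12, 0): "Blackwell consumer",
}
MIN_SUPPORTED = (7, 5)  # below this, the runtime cc<7.5 warning is expected


def parse_cc(text):
    """Turn a string such as '8.6' into the tuple (8, 6)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def expectation(cc):
    """Return (family, expected outcome) for a compute-capability tuple."""
    family = FAMILIES.get(cc, "unlisted")
    if cc < MIN_SUPPORTED:
        return family, "expect cc<7.5 warning and binary load failure"
    if family == "unlisted":
        return family, "not covered by this matrix"
    return family, f"expect sm_{cc[0]}{cc[1]} kernels to run"
```

On recent drivers the capability string can usually be read with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`; older drivers may not support that query field, in which case the value has to come from the vendor's specification.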

Windows

  • Windows + OMP (no CUDA): kspaceFirstOrder-OMP-windows.exe + 19 shared DLLs download, example runs
  • Windows + CUDA on Turing (RTX 20xx) — kspaceFirstOrder-CUDA-windows.exe (14.8 MB) + cudart64_13.dll + cufft64_12.dll download; GPU example runs. Proves the v1.4.1 Windows regression (sm_75-only 3.4 MB binary) is fixed.
  • Windows + CUDA on Ampere / Ada / Hopper / Blackwell — any one card sufficient; same flow

What success looks like

For each row above:

  • Install completes (binaries download to the expected paths)
  • python -c "import kwave" runs without unexpected warnings (or with the expected runtime cc<7.5 warning for Maxwell/Pascal/Volta)
  • An example (any IVP / OMP / CUDA example as appropriate) runs to completion and produces sensible output

If any row fails, comment on this issue with:

  • Platform + GPU model + compute capability
  • Output of pip show k-wave-python and python -c "import kwave; print(kwave.__version__, kwave.BINARY_VERSION, kwave.BINARY_PATH)"
  • The actual error / unexpected behavior
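
The items above can be gathered in one go. This is a small sketch, assuming only the attributes already named in this issue (`kwave.__version__`, `kwave.BINARY_VERSION`, `kwave.BINARY_PATH`); `build_report` is a hypothetical helper. It does not detect the GPU model or compute capability, which still need to be added by hand.

```python
import platform
from importlib import metadata


def build_report():
    """Collect platform and k-wave-python details as a pasteable text block."""
    lines = [f"Platform: {platform.platform()} ({platform.machine()})"]
    # Installed distribution version, independent of whether import works.
    try:
        version = metadata.version("k-wave-python")
        lines.append(f"k-wave-python (pip metadata): {version}")
    except metadata.PackageNotFoundError:
        lines.append("k-wave-python (pip metadata): not installed")
    # Import can fail (for example on a missing binary), so report the error.
    try:
        import kwave
    except Exception as exc:
        lines.append(f"import kwave failed: {exc!r}")
    else:
        for attr in ("__version__", "BINARY_VERSION", "BINARY_PATH"):
            lines.append(f"kwave.{attr}: {getattr(kwave, attr, '<missing>')}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report())
```

Paste the printed block into the comment together with the GPU details and the actual error.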

What this gates

Promoting to stable v0.6.3. Once the matrix is reasonably covered (at minimum: one Linux + CUDA per supported arch family, one Windows + CUDA, one macOS + OMP, one Linux + OMP), I'll prep the one-line promotion PR.
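
The minimum coverage rule can be written down as a simple set check. The sketch below is illustrative only: `SUPPORTED_FAMILIES` and `missing_rows` are hypothetical names, and treating the two Blackwell rows as separate families is an assumption based on how the matrix lists them, not something this issue states.

```python
# Assumption: one entry per Linux + CUDA row with cc >= 7.5 in the matrix.
SUPPORTED_FAMILIES = (
    "turing", "ampere", "ada", "hopper",
    "blackwell-consumer", "blackwell-datacenter",
)


def missing_rows(verified, families=SUPPORTED_FAMILIES):
    """Return required (os, backend, family) rows not yet in `verified`.

    `verified` is a set of (os, backend, family) tuples; family is None
    for OMP rows.
    """
    required = {("linux", "omp", None), ("macos", "omp", None)}
    required |= {("linux", "cuda", fam) for fam in families}
    # key=str avoids comparing None with str while keeping a stable order.
    missing = sorted(required - set(verified), key=str)
    # Any single Windows + CUDA card satisfies the Windows requirement.
    if not any(o == "windows" and b == "cuda" for o, b, _ in verified):
        missing.append(("windows", "cuda", "any"))
    return missing
```

An empty return value means the minimum stated above is met; anything else lists the rows still waiting for a volunteer.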
