Native Windows (MSYS2/MinGW-w64) build for ABACUS #7423
Open
ErjieWu wants to merge 19 commits into
Lay the groundwork for a native Windows serial plane-wave build
(no MPI, no LCAO, no ELPA/PEXSI/hybrid). Targets MinGW-w64 GCC, which
ships the POSIX headers ABACUS uses and accepts its GCC attributes, so
the source needs only minimal, Linux-safe portability shims.
- source_base/fs_compat.h (new): portable ModuleBase::make_directory()
wrapping _mkdir (Windows) / mkdir(path,0755) (POSIX). The Windows CRT
mkdir takes no permission-mode argument.
- global_file.cpp, global_function.cpp: route the 7 mkdir(path,0755)
call sites through the helper; drop unistd.h/sys/stat.h includes.
- CMakeLists.txt:
* gate find_package(ScaLAPACK REQUIRED) on ENABLE_MPI so the serial
build does not require a distributed-memory library;
* define _USE_MATH_DEFINES/NOMINMAX/_CRT_SECURE_NO_WARNINGS on WIN32;
* skip -O3 -g default flags and the -lm link for MSVC;
* skip the post-install abacus symlink on Windows.
- tools/windows/build-native-serial.ps1 (new): MinGW configure/build helper.
- docs/advanced/install_windows_native.md (new): native-build documentation.
All changes are guarded or platform-neutral; the Linux build is unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
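For reviewers who want to see the shape of the directory helper, here is a minimal sketch of what a wrapper like ModuleBase::make_directory() could look like. It is not the contents of fs_compat.h: the namespace `compat_sketch` and the return convention (0 on success, -1 on failure, as with mkdir itself) are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#if defined(_WIN32)
#include <direct.h>    // _mkdir
#else
#include <sys/stat.h>  // mkdir
#include <sys/types.h> // mode_t
#endif

namespace compat_sketch  // hypothetical namespace, not the real ModuleBase
{
// Create one directory level. Returns 0 on success and -1 on failure
// (errno is set by the underlying call on both platforms).
inline int make_directory(const std::string& path)
{
    assert(!path.empty());  // an empty path is a caller bug
#if defined(_WIN32)
    return _mkdir(path.c_str());       // the Windows CRT call takes no mode argument
#else
    return mkdir(path.c_str(), 0755);  // same mode as the existing call sites
#endif
}
} // namespace compat_sketch
```

Keeping the `#if` inside one helper means the call sites stay identical on every platform, which is what lets the unistd.h/sys/stat.h includes be dropped from the callers.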
With these fixes the native Windows serial plane-wave build
(abacus_pw_ser.exe, MinGW-w64 GCC + OpenBLAS + FFTW) compiles, links, and runs
examples/02_scf/01_pw_Si2 to SCF convergence with a deterministic total energy
(-215.5057 eV, bit-identical across runs).
Build-system fixes:
- cmake/FindBlas.cmake, cmake/FindLapack.cmake: the wrappers delegate to
CMake's builtin FindBLAS/FindLAPACK, but on the case-insensitive Windows
filesystem the wrapper matched itself and recursed forever. Drop our module
dir from CMAKE_MODULE_PATH around the builtin call (no-op on Linux).
Source portability fixes (all guarded or platform-neutral; Linux unaffected):
- module_fft/fft_base.h, fft_cpu.h: remove __attribute__((weak)) from the FFT
virtuals. The weak-without-definition pattern relied on the ELF linker
resolving unbound weak symbols to null; on Windows/PE (MinGW) it produced null
vtable slots, so the first FFT dispatch (FFT_Bundle::setupFFT) called address 0
and segfaulted. Base virtuals get trivial default bodies; the float overrides
become concrete via ENABLE_FLOAT_FFTW=ON.
- module_parameter/input_conv.h: port the POSIX <regex.h> expression parser to
C++ <regex> (MinGW has no <regex.h>).
- module_container/base/core/cpu_allocator.cpp: replace posix_memalign with
_aligned_malloc/_aligned_free on Windows, applied consistently to both allocate
overloads and free.
- module_restart/restart.cpp: map POSIX S_IRUSR/S_IWUSR to _S_IREAD/_S_IWRITE
and include <io.h> for low-level open/read/write/close on Windows.
Tooling/docs:
- tools/windows/build-native-serial.ps1: use the verified flags
(BLA_VENDOR=OpenBLAS, ENABLE_FLOAT_FFTW=ON, COMMIT_INFO=OFF, the GCC-16
force-include workaround).
- docs/advanced/install_windows_native.md: document the gcc-fortran package,
the verified build/run, and every source change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
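The allocator change in this commit pairs each aligned allocation with the matching release call. A minimal sketch of that pairing follows; the function names `aligned_alloc_sketch` and `aligned_free_sketch` are hypothetical and this is not the code in cpu_allocator.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>   // posix_memalign (POSIX), std::free
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Allocate `bytes` bytes aligned to `alignment`; returns nullptr on failure.
inline void* aligned_alloc_sketch(std::size_t alignment, std::size_t bytes)
{
    // Both APIs need a power-of-two alignment; posix_memalign additionally
    // requires a multiple of sizeof(void*).
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    return _aligned_malloc(bytes, alignment);  // note the (size, alignment) order
#else
    void* p = nullptr;
    if (posix_memalign(&p, alignment, bytes) != 0)
    {
        return nullptr;
    }
    return p;
#endif
}

// Release memory obtained from aligned_alloc_sketch().
inline void aligned_free_sketch(void* p)
{
#if defined(_WIN32)
    _aligned_free(p);  // _aligned_malloc memory must not be passed to free()
#else
    std::free(p);
#endif
}
```

The point of routing every allocate overload and the free through one pair is that memory from _aligned_malloc handed to plain free() is undefined behaviour on Windows, so a partial port would corrupt the heap rather than fail to compile.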
psi_initializer::random_t, in the pw_seed>0 branch, generates per-stick
random amplitude/phase into stickrr/stickarg and then distributes them
into the gathered tmprr/tmparg arrays via stick_to_pool() -- but that call
is guarded by #ifdef __MPI. In a serial build tmprr/tmparg therefore stay
zero-initialized, so every seeded random wavefunction is all-zero. This
later trips Gram-Schmidt orthonormalization ("psi_norm <= 0.0") and aborts
the run. The path is never hit in CI because the integration tests run
under MPI.
Add the serial counterpart: copy each stick directly into tmprr/tmparg
using the same mapping as stick_to_pool()'s rank-0 branch
(out[ixy2is_[ir]*nz + iz] = stick[iz]). ixy2is_ is populated for both
serial and MPI builds via pw_wfc_->getfftixy2is().
Verified on a representative set of 15 tests/01_PW cases run with the
native Windows serial PW build (abacus_pw_ser.exe): all converged total
energies now match the official result.ref references to <= ~7e-7 eV.
Before this fix the 6 cases using pw_seed with random wavefunctions
aborted; the other 9 already matched to ~1e-9 eV.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
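The serial copy described in this commit can be sketched as below. Only the index expression out[ixy2is_[ir]*nz + iz] = stick[iz] comes from the commit text; the function name, the std::vector signature, and the convention that xy points without a stick carry a negative entry are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy one z-stick into the gathered array.
// ir      : flattened xy index of the stick
// ixy2is  : map from xy index to stick number (assumed negative where no stick exists)
// nz      : number of points along z
// out     : gathered array of size (number of sticks) * nz
inline void stick_to_pool_serial(const std::vector<double>& stick, int ir,
                                 const std::vector<int>& ixy2is, int nz,
                                 std::vector<double>& out)
{
    assert(static_cast<int>(stick.size()) == nz);
    const int is = ixy2is[ir];
    assert(is >= 0);  // the caller must pass an xy index that owns a stick
    for (int iz = 0; iz < nz; ++iz)
    {
        out[static_cast<std::size_t>(is) * nz + iz] = stick[iz];
    }
}
```

Calling this once per stick for both the amplitude and the phase array is the serial equivalent of the gather: without it the destination arrays keep their zero initialization, which is exactly the all-zero wavefunction the commit reports.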
…ke scripts
Per review feedback, the native-Windows support should plug into ABACUS's
existing build/test infrastructure (like any other backend/variant) rather than
carry its own scripts.
Build: add a Windows toolchain variant, mirroring toolchain_gnu.sh /
build_abacus_gnu.sh:
- toolchain/toolchain_windows.sh -- installs the MinGW-w64 prerequisites via
pacman on MSYS2 (gcc, gfortran, openblas, fftw, cmake, ninja) plus bc for the
test harness; records the prefix in install/setup like the Linux variants.
- toolchain/build_abacus_windows.sh -- configures + builds the serial PW binary
(ENABLE_MPI/LCAO=OFF, OpenBLAS+FFTW) and writes abacus_env.sh.
Removed the one-off tools/windows/build-native-serial.ps1.
Test: reuse tests/integrate/Autotest.sh instead of a separate script. Added a
serial mode: with -n 0 the harness runs the binary directly (no mpirun), so a
serial build (any OS) reuses the standard catch_properties.sh / result.ref
comparison. Added tests/integrate/CASES_SERIAL_PW.txt listing serial-PW cases.
Validation (build_abacus_windows.sh, then Autotest.sh -n 0 -f
CASES_SERIAL_PW.txt): all 15 01_PW cases run; total energies/forces/stresses
match the Linux result.ref to ~1e-7 relative. The few WARNINGs (016/017 etot
~1e-7 eV; 003/009/019 stress/force) are absolute-threshold exceedances from
cross-platform / cross-BLAS floating point, classified WARNING (not ERROR) by
the harness.
docs/advanced/install_windows_native.md updated to describe the toolchain +
serial-Autotest flow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per review: the serial PW build should be checked against the existing PW test
suite (tests/01_PW) via the standard harness, not a hand-picked subset.
- Remove tests/integrate/CASES_SERIAL_PW.txt. The canonical list already exists
at tests/01_PW/CASES_CPU.txt and is used by the standard ctest registration
(tests/01_PW/CMakeLists.txt runs Autotest.sh from that directory). Serial runs
just add -n 0:
cd tests/01_PW
bash ../integrate/Autotest.sh -a <abacus_pw_ser.exe> -n 0
- .gitattributes: force LF for *.sh and CASES_*.txt so the toolchain scripts,
Autotest.sh and the bash-parsed case lists work on a fresh Windows checkout
(core.autocrlf would otherwise rewrite them to CRLF).
- docs/advanced/install_windows_native.md: document the whole-01_PW serial run.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror the Linux toolchain UX: `source abacus_env.sh` then run `abacus`.
build_abacus_windows.sh now copies the configured binary (abacus_pw_ser.exe) to
abacus.exe in the build dir. Native Windows symlinks need elevation (so the
CMake `abacus` symlink step is skipped on WIN32); the .exe copy lets a bare
`abacus` resolve in the MSYS2 shell and in cmd/PowerShell. abacus_env.sh
already puts that directory (and the MinGW runtime DLLs via the toolchain
setup) on PATH.
Verified: source abacus_env.sh; abacus --version -> runs from any directory.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Binstream::Binstream/open pass the caller's fopen mode ("r"/"w"/"a")
straight through. On Windows that opens in *text* mode, which translates
CRLF and treats 0x1A as EOF, corrupting the binary wavefunction/charge
files Binstream is built to read -> "Error in Binstream: Some data didn't
be read". On POSIX "r" == "rb", so the bug is Windows-only.
Binstream is always a binary stream, so append "b" to the mode when the
caller omitted it. Harmless no-op on Linux.
Fixes these serial 01_PW cases on the native Windows build (verified):
- 056_PW_IW (init_wfc=file: read wfc from binary file)
- 057_PW_SO_IW (SOC + init_wfc=file)
- 075_PW_CHG_BINARY (binary charge I/O)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
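The mode normalization described in this commit amounts to a few lines. The sketch below is an illustration of the idea, not the code in binstream.cpp; the function name `ensure_binary_mode` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Return an fopen mode string that is guaranteed to request binary I/O.
// "r" -> "rb", "w" -> "wb", "a" -> "ab", "r+" -> "r+b"; modes that already
// contain 'b' are returned unchanged.
inline std::string ensure_binary_mode(const std::string& mode)
{
    assert(!mode.empty());  // an empty mode is a caller bug
    if (mode.find('b') != std::string::npos)
    {
        return mode;
    }
    return mode + "b";
}
```

Appending rather than inserting is enough because the C standard accepts the 'b' either before or after a '+' ("rb+" and "r+b" are equivalent), and POSIX systems ignore the flag entirely.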
Structure_Factor::bspline_sf (nbspline>0, B-spline structure factor) scatters
each real-space plane into tmpr via Parallel_Grid::zpiece_to_all, which is
guarded by #ifdef __MPI. In a serial build tmpr is never filled (it is new
double[nrxx], uninitialized), so real2recip(tmpr, strucFac) produces a garbage
structure factor -> grossly wrong total energy, force and stress. CI never hits
this path (integration tests run under MPI).
Add the serial branch: fill tmpr directly using the SAME real-space layout as
zpiece_to_all's serial path, rho[ir*nczp + znow] (xy outer, z innermost;
nczp==nz, znow==iz when serial).
Verified on tests/01_PW/032_PW_15_CF_CS_bspline (native Windows serial): energy
and stress now match the reference to ~1e-8 (was ~1480 eV / 30000 kbar off);
residual force ~5e-3 is B-spline interpolation + cross-platform float noise.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
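The layout named in this commit (xy outer, z innermost) can be illustrated with a small standalone sketch. The function name and the plane-by-plane std::vector input are assumptions for illustration; only the index expression ir*nz + iz is taken from the commit text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assemble a real-space array from per-plane data.
// planes[iz][ir] is the value at xy index ir in z-plane iz.
// The result stores z innermost: tmpr[ir * nz + iz].
inline std::vector<double> fill_real_space_serial(
    const std::vector<std::vector<double>>& planes, std::size_t nxy)
{
    const std::size_t nz = planes.size();
    std::vector<double> tmpr(nxy * nz, 0.0);  // zero-filled, never left uninitialized
    for (std::size_t iz = 0; iz < nz; ++iz)
    {
        assert(planes[iz].size() == nxy);
        for (std::size_t ir = 0; ir < nxy; ++ir)
        {
            tmpr[ir * nz + iz] = planes[iz][ir];
        }
    }
    return tmpr;
}
```

With two xy points and three planes {1,2}, {3,4}, {5,6}, the result is 1 3 5 2 4 6: each xy column is contiguous along z, which is the order the subsequent real-to-reciprocal transform expects according to the commit.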
…s not a bug) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
Adds an experimental native Windows (MSYS2/MinGW-w64) build path for ABACUS (Phase 1: serial plane-wave only) while keeping Linux behavior unchanged, and extends the existing test harness to support running without an MPI launcher (-n 0).
Changes:
- Introduces a Windows toolchain variant (pacman-installed OpenBLAS/FFTW + Ninja/CMake) and build script to produce a real Windows executable.
- Improves portability across Windows/case-insensitive filesystems (CMake FindBLAS/LAPACK recursion fix, binary file I/O mode, mkdir compatibility, Windows CRT permission bits).
- Fixes serial-only correctness issues surfaced by running the PW test suite without MPI (seeded random wavefunction init + B-spline structure factor grid fill), and adds serial mode support to Autotest.sh.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| toolchain/toolchain_windows.sh | Installs MSYS2/MinGW-w64 dependencies via pacman and writes an env setup file. |
| toolchain/build_abacus_windows.sh | Configures/builds a serial PW Windows executable and generates abacus_env.sh. |
| tests/integrate/Autotest.sh | Adds -n 0 serial mode (no mpirun) and adjusts OpenMP thread defaulting. |
| source/source_pw/module_pwdft/structure_factor.cpp | Initializes tmpr in serial builds to avoid garbage structure factors. |
| source/source_psi/psi_initializer.cpp | Fixes seeded random wavefunction initialization in serial (non-MPI) builds. |
| source/source_io/module_restart/restart.cpp | Adds Windows CRT compatibility for open() mode bits and includes <io.h>. |
| source/source_io/module_parameter/input_conv.h | Replaces POSIX <regex.h> parsing with portable std::regex. |
| source/source_io/module_output/binstream.cpp | Forces binary fopen mode by ensuring 'b' is present. |
| source/source_base/module_fft/fft_cpu.h | Removes __attribute__((weak)) from FFT_CPU virtuals (Windows/PE safety). |
| source/source_base/module_fft/fft_base.h | Provides non-null default virtual bodies to avoid PE null vtable slots. |
| source/source_base/module_container/base/core/cpu_allocator.cpp | Uses _aligned_malloc/_aligned_free on Windows for aligned allocations. |
| source/source_base/global_function.cpp | Switches directory creation to a new portable helper. |
| source/source_base/global_file.cpp | Switches directory creation to a new portable helper. |
| source/source_base/fs_compat.h | Adds portable ModuleBase::make_directory() wrapper. |
| docs/advanced/install_windows_native.md | Documents the experimental native Windows build and serial PW testing flow. |
| CMakeLists.txt | Adds Windows portability defines; gates ScaLAPACK on ENABLE_MPI; skips -lm and symlink install for Windows/MSVC. |
| cmake/FindLapack.cmake | Avoids infinite recursion on case-insensitive filesystems by temporarily adjusting CMAKE_MODULE_PATH. |
| cmake/FindBlas.cmake | Avoids infinite recursion on case-insensitive filesystems by temporarily adjusting CMAKE_MODULE_PATH. |
| .gitattributes | Forces LF endings for bash scripts and CASES_*.txt to avoid CRLF issues on Windows. |
…s off
before_scf() unconditionally dereferenced
*(two_center_bundle_.overlap_orb_alpha) to pass it to deepks.build_overlap().
overlap_orb_alpha is only built when DeePKS is enabled (descriptor orbitals);
with DeePKS off it is a null unique_ptr, so forming the reference is undefined
behaviour (caught as an abort in a debug libstdc++ build; benign in release as
the DeePKS stub ignores it). Guard the call on the integrator being present.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extend the native-Windows toolchain to the full supported configuration,
mirroring build_abacus_gnu.sh:
- toolchain_windows.sh: also pacman-install cereal (LCAO), msmpi (MPI), and
scalapack (distributed LCAO eigensolver). Documents that the MS-MPI runtime is
a separate system-wide Microsoft redistributable.
- build_abacus_windows.sh: build MPI + LCAO by default (abacus_basic_para.exe);
ENABLE_MPI / ENABLE_LCAO env toggles select serial / PW-only. Point FindMPI at
the MinGW MS-MPI import lib; ScaLAPACK is found automatically when ENABLE_MPI.
abacus_env.sh now also exports OPENBLAS_NUM_THREADS=1 (required so OpenBLAS's
multithread buffer allocator does not fail under multiple MPI ranks).
- docs/advanced/install_windows_native.md: document the LCAO+MPI build,
parallel testing (mpiexec / mpirun shim), and the known serial gamma-only LCAO
bug (use the MPI build, which is correct to ~1e-11 even on a single rank).
Validated against 01_PW / 02_NAO_Gamma / 03_NAO_multik via the standard
harness: under MPI all three pass within the cross-platform error range;
residual differences are float noise at strict absolute thresholds,
gauge-dependent outputs, or excluded features (SCAN/meta-GGA needs LibXC, DFT+U
needs MPI).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Running tests/integrate/Autotest.sh directly failed with "no mpirun found":
MS-MPI ships only mpiexec, and the harness invokes `mpirun -np N`. Three
Windows-specific gaps, all fixed in build_abacus_windows.sh so the standard
harness works unchanged:
* mpirun shim. The build now drops an `mpirun`->`mpiexec` shim next to the
binary (on PATH via abacus_env.sh). MS-MPI's `-n`/`-np <N> <prog>` syntax
matches what the harness passes, so forwarding args is enough.
* OpenBLAS thread pinning. MSYS2's OpenBLAS is OpenMP-threaded (links libgomp),
so OMP_NUM_THREADS -- not OPENBLAS_NUM_THREADS -- caps its threads. Autotest
sets OMP_NUM_THREADS=nproc/np, so each rank spawned a multithreaded BLAS, the
ranks oversubscribed the cores, and OpenBLAS's buffer allocator died
("Memory allocation still failed after 10 retries"). The shim and abacus_env.sh
now pin OMP_NUM_THREADS=1 (ABACUS is built USE_OPENMP=OFF, so parallelism is
MPI; the BLAS pin costs nothing).
* DLL bundling. mpiexec does not propagate PATH to child ranks when stdout is
redirected to a file (as the harness does), so the child abacus.exe failed to
load libopenblas/libfftw3/libscalapack ("error while loading shared
libraries"). The build now copies the dependent MinGW/OpenBLAS/FFTW/ScaLAPACK
DLLs next to abacus.exe; Windows searches the application directory before
PATH, making the binary self-contained.
Verified end to end with the default invocation `bash Autotest.sh -a abacus`
(np=4, via the shim): 01_PW/001, 02_NAO_Gamma/scf_afm (gamma-only LCAO), and
03_NAO_multik/scf_pp_upf201 all pass. Corrects the earlier docs/notes that
cited OPENBLAS_NUM_THREADS and a hand-made shim.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The mpirun shim died with `exec: mpiexec: not found`: MSYS2's MinGW shell does
not inherit the Windows PATH, and MS-MPI's mpiexec.exe lives in its own Bin dir
(only msmpi.dll is in System32). The MSMPI_BIN env var (set by the MS-MPI
installer) *is* inherited, so abacus_env.sh now prepends
`cygpath -u "$MSMPI_BIN"` to PATH, making both `mpiexec` and the shim resolve.
Verified from a minimal PATH: which mpiexec/mpirun both resolve and 01_PW/001
passes via the default harness invocation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two issues from code review of the Windows-port commits:
1. FFT_CPU<float> undefined references on Linux (regression). The port removed
__attribute__((weak)) from the FFT virtuals (it left null vtable slots on
PE/MinGW and crashed). But the real FFT_CPU<float> methods live in
fft_cpu_float.cpp, which is compiled only when ENABLE_FLOAT_FFTW=ON. With
weak gone and float off (the Linux default), the FFT_CPU<float> vtable --
still emitted wherever the class is constructed (FFT_Bundle) -- referenced
undefined symbols:
undefined reference to `ModuleBase::FFT_CPU<float>::setupFFT()' ...
Provide trivial FFT_CPU<float> method definitions in the always-compiled
fft_cpu.cpp, guarded by `#if !defined(__ENABLE_FLOAT_FFTW)`, so every vtable
slot is valid on any ABI without weak and without pulling in libfftw3f. The
float CPU path stays unreachable at runtime (FFT_Bundle::setupFFT
WARNING_QUITs for single/mixing CPU FFT unless the macro is set). When the
macro is on, the stubs are excluded and fft_cpu_float.cpp supplies the real
definitions -- no duplicate symbols. Verified by linking the float vtable TU
against fft_cpu.o in both macro states (off: links via stubs; on: links via
fft_cpu_float.o), and that dropping both reproduces the reported errors.
2. parse_expression (input_conv.h) could push indeterminate values into vec.
If std::regex_search found no match, sub_str stayed empty and was parsed
anyway; in the non-multiplication branch `T occ` was uninitialized and the
`convert >> occ` extraction was unchecked. Now: a no-match token is an input
error (WARNING_QUIT), occ is value-initialized, and a failed extraction
fails fast. Consistent with the other expression parsers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
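The three guards listed in item 2 (no-match is an error, the value is value-initialized, the extraction is checked) can be sketched on a simplified token grammar. This is not parse_expression itself: the helper name `append_token`, the "value" / "count*value" grammar, and the use of std::invalid_argument in place of WARNING_QUIT are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Parse one token of the form "value" or "count*value" and append the
// value(s) to vec. Throws std::invalid_argument on any malformed token,
// leaving vec unchanged.
template <typename T>
void append_token(const std::string& token, std::vector<T>& vec)
{
    static const std::regex pattern(R"(^(?:([0-9]+)\*)?([^* ]+)$)");
    std::smatch m;
    if (!std::regex_search(token, m, pattern))
    {
        throw std::invalid_argument("malformed token: " + token);  // no match is an input error
    }
    assert(m.size() == 3);  // whole match plus two capture groups

    T value{};  // value-initialized, never indeterminate
    std::istringstream convert(m[2].str());
    if (!(convert >> value) || !convert.eof())
    {
        throw std::invalid_argument("cannot convert: " + token);  // failed or partial extraction
    }

    const std::size_t count = m[1].matched ? std::stoul(m[1].str()) : 1;
    vec.insert(vec.end(), count, value);
}
```

For example, feeding "3*1.5" and then "2" into a std::vector<double> appends 1.5 three times followed by 2.0, while "abc" throws before anything is pushed.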
Rework the FFT_CPU<float> vtable handling so Linux builds byte-for-byte as
upstream and only Windows gets a delta. My earlier port had (a) removed
__attribute__((weak)) outright and (b) added trivial float stubs in
fft_cpu.cpp -- both changed working Linux core code, and (b) didn't even reach
targets that compile fft_bundle.cpp without linking fft_cpu.cpp (e.g.
MODULE_HAMILT_XCTest_VXC), so Linux still failed to link:
undefined reference to `ModuleBase::FFT_CPU<float>::setupFFT()' ...
Root cause: the upstream virtuals are __attribute__((weak)) so the ELF linker
nulls the unused FFT_CPU<float> vtable slots when ENABLE_FLOAT_FFTW is off.
MinGW/PE has no equivalent -- weak template members there collide
("multiple definition") or leave null slots that crash on dispatch (verified
both empirically with g++).
Fix, keeping Linux untouched:
* Introduce ABACUS_FFT_WEAK = __attribute__((weak)) on non-Windows, empty on
_WIN32, and use it in place of the raw attribute in fft_base.h / fft_cpu.h.
Preprocessing with -U_WIN32 reproduces the upstream headers exactly (14 weak
attrs, no extra defs); fft_cpu.cpp is reverted to pristine.
* On Windows the empty macro makes the slots ordinary symbols; the build
already sets ENABLE_FLOAT_FFTW=ON, so fft_cpu_float.cpp supplies the real
FFT_CPU<float> methods. The non-pure FFT_BASE<T> virtuals (which had no body,
relying on weak) get trivial bodies in a `#if defined(_WIN32)` block -- never
executed (abstract base; backends override what they use). This block is
compiled only on Windows.
Verified with MinGW g++: constructing FFT_CPU<float> and dispatching through
its vtable links (no multiple-definition, no undefined base/derived refs) and
runs (no null-vtable crash); and the Linux-simulated preprocess output matches
upstream.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Windows build defaulted to -j nproc. On a 20-core box, 20 concurrent -O3
compilations of heavy template TUs (source_cell/module_symmetry/symmetry.cpp,
read_pp_upf201.cpp, ...) exhausted memory and ninja died with "cc1plus.exe: out
of memory allocating N bytes" -- even with 31 GB RAM.
Default -j is now min(nproc, MemTotalGB / 3) (~3 GB budget per job), read from
/proc/meminfo; an explicit -j still overrides, and the chosen value is printed
with a hint to lower it if cc1plus runs out of memory. Falls back to nproc if
/proc/meminfo is unreadable.
Not a code issue -- the sources compiled fine up to the OOM.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
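The job-count rule above is implemented in the bash build script; the arithmetic is restated here in C++ only to make the rule concrete. The function name, the integer division, and the clamp to at least one job are assumptions, not details confirmed by the commit.

```cpp
#include <algorithm>
#include <cassert>

// Default parallel job count: min(nproc, mem_total_gb / gb_per_job).
// A non-positive mem_total_gb means the memory size could not be read,
// in which case the rule falls back to nproc.
inline int default_build_jobs(int nproc, long mem_total_gb, int gb_per_job = 3)
{
    assert(nproc >= 1 && gb_per_job >= 1);
    if (mem_total_gb <= 0)
    {
        return nproc;  // memory unknown: keep the old behaviour
    }
    const long by_memory = mem_total_gb / gb_per_job;  // integer division (assumed)
    // Clamp to one job so a very small machine still builds (assumed).
    return static_cast<int>(std::max(1L, std::min<long>(nproc, by_memory)));
}
```

Under these assumptions the 20-core, 31 GB machine from the commit message gets 10 jobs instead of 20, while a 4-core machine with 64 GB stays at 4 because the core count is the tighter limit.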
This was a working note for the native-Windows build trial, not reference
documentation for the repository. Drop it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
This PR adds the ability to build and run ABACUS natively on Windows (without WSL), and fixes several latent bugs uncovered along the way. The native build supports plane-wave (PW) and numerical-atomic-orbital (LCAO) bases, in both serial and MPI (MS-MPI) configurations, using the MSYS2/MinGW-w64 GCC toolchain with OpenBLAS, FFTW, and ScaLAPACK.
The guiding principle throughout: do not change working Linux code. Every Windows-specific change is isolated behind #if defined(_WIN32) (or a macro that expands to the exact upstream token on non-Windows), so Linux/macOS builds are byte-for-byte unchanged. The only exceptions are genuine bug fixes (noted below), which improve correctness on all platforms.
New feature: native Windows build
Rather than introducing bespoke Windows scripts, the build plugs into the existing toolchain/ infrastructure and the existing integration-test harness, mirroring the other backends (gnu, intel, gcc-mkl, …):
Testing uses the standard harness unchanged — no separate Windows test script and no separate case list:
Three Windows-specific gaps were closed so the unmodified harness can drive MS-MPI: MS-MPI ships only mpiexec (not mpirun); MSYS2's OpenBLAS is OpenMP-threaded so OMP_NUM_THREADS (not OPENBLAS_NUM_THREADS) must be pinned to avoid its allocator failing under multiple ranks; and mpiexec doesn't propagate PATH to child ranks when stdout is redirected, which the bundled DLLs solve.
Build-system / portability changes (Linux unaffected)
Bug fixes
These are genuine correctness fixes (not Windows-only workarounds):
Test harness
Scope / not included
The following remain disabled on Windows (excluded by design, not regressions): ELPA, PEXSI, hybrid functionals (LibRI/LibComm), DeePKS/ML-KEDF, LibXC (so meta-GGA/SCAN), GPU (CUDA/ROCm), DSP. Test cases that require them are expected to fail.
Known limitations
Testing