Summary
coretrace-stack-analyzer can crash when it analyzes a compile_commands.json
batch with more than one worker. The failure happens while the analyzer compiles
source files to LLVM IR through compilerlib::compile(...), which runs Clang
frontend actions in-process.
The observed crash is not caused by CoreTrace CLI parsing or by cross-TU summary
logic. It is triggered by parallel module loading/compilation inside the stack
analyzer.
Environment
- Platform: macOS arm64
- LLVM/Clang: Homebrew LLVM 20.1.2
- Binary: ctrace, linked against libclang-cpp.dylib and libLLVM.dylib
- Shell stack limit: 8176 KB
- Hardware concurrency observed by the analyzer: 8
Reproduction
From the parent CoreTrace checkout that embeds coretrace-stack-analyzer:
./build/ctrace \
  --compile-commands=./build/compile_commands.json \
  --invoke ctrace_stack_analyzer \
  --config config/tool-config.json
With stack_analyzer.jobs unset/empty, the analyzer resolves to jobs=auto and
starts multiple workers.
The crash also reproduces with cross-TU disabled:
{
  "stack_analyzer": {
    "jobs": "2",
    "resource_cross_tu": false,
    "uninitialized_cross_tu": false
  }
}
jobs=2 is enough to reproduce. jobs=1 completes successfully.
Actual behavior
The process exits with a native crash shortly after:
== CoreTrace == [INFO] Running specific tools on 16 file(s)
== CoreTrace == [INFO] Running CoreTrace Stack Analyzer on 16 files
bus error
or:
illegal hardware instruction
Under lldb, the actual stop reason is an EXC_BAD_ACCESS in Clang Sema:
* thread #4, stop reason = EXC_BAD_ACCESS (code=2, address=0x16ff1bb58)
frame #0: libclang-cpp.dylib`CheckConvertibilityForTypeTraits(...) + 136
The failing instruction writes to the current stack:
libclang-cpp.dylib`CheckConvertibilityForTypeTraits:
-> stp x21, x24, [sp, #0x28]
Registers at the crash:
sp = 0x000000016ff1bb30
pc = libclang-cpp.dylib`CheckConvertibilityForTypeTraits(...) + 136
The faulting address is sp + 0x28, and the stack pointer is in an inaccessible
region:
memory region $sp
[0x000000016ff18000-0x000000016ff1c000) ---
This points to a worker thread stack overflow while Clang is deeply instantiating
C++ templates.
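The address arithmetic above can be checked mechanically. The following is a small sketch that only restates the lldb values quoted in this report; the constant and function names are illustrative and do not come from the analyzer. It confirms that the store target of the stp instruction equals the reported fault address, and that both the stack pointer and the fault address fall inside the 16 KiB region that lldb reports as inaccessible.

```cpp
#include <cstdint>

// Values copied from the lldb output quoted in this report.
constexpr std::uint64_t kSp           = 0x000000016ff1bb30ULL;
constexpr std::uint64_t kFaultAddress = 0x000000016ff1bb58ULL;
constexpr std::uint64_t kRegionBegin  = 0x000000016ff18000ULL;
constexpr std::uint64_t kRegionEnd    = 0x000000016ff1c000ULL;  // exclusive

// `stp x21, x24, [sp, #0x28]` stores starting at sp + 0x28.
constexpr std::uint64_t storeTarget(std::uint64_t sp) { return sp + 0x28; }

// True if `address` lies in the half-open region [kRegionBegin, kRegionEnd).
constexpr bool inRegion(std::uint64_t address) {
    return address >= kRegionBegin && address < kRegionEnd;
}

static_assert(storeTarget(kSp) == kFaultAddress,
              "the faulting store targets sp + 0x28");
static_assert(inRegion(kSp), "sp is inside the inaccessible region");
static_assert(inRegion(kFaultAddress),
              "the fault address is inside the inaccessible region");
static_assert(kRegionEnd - kRegionBegin == 16 * 1024,
              "the inaccessible region spans 16 KiB");
```

All four checks hold at compile time, which is consistent with the thread having run off the end of its stack into an unmapped or protected region.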
Expected behavior
The analyzer should either:
- complete analysis successfully, or
- report a per-translation-unit compilation/loading failure without crashing the
hosting process.
ctrace should not be terminated by a native crash from an embedded analyzer
worker.
Relevant code path
The CoreTrace bridge invokes the analyzer in-process:
ctrace::stack::app::runAnalyzerApp(std::move(parseResult.parsed));
The analyzer schedules module loading in worker threads:
runParallelWork(inputFilenames.size(), loadJobs,
                [&](std::size_t index) { loadSingleModule(index); });
Each worker calls:
analysis::loadModuleForAnalysis(inputFilename, cfg, *moduleContext, localErr);
The input pipeline compiles non-IR inputs through compilerlib:
return compilerlib::compile(compileArgs, outputMode);
compilerlib executes Clang frontend actions in-process, including:
clang::EmitBCAction
clang::EmitLLVMAction
clang::EmitLLVMOnlyAction
Verification results
The following matrix was observed:
| Configuration | Result |
| --- | --- |
| jobs=auto, cross-TU enabled | crashes |
| jobs=auto, cross-TU disabled | crashes |
| jobs=2, cross-TU disabled | crashes |
| jobs=1, cross-TU enabled | exits 0 |
| jobs=1, cross-TU disabled | exits 0 |
This isolates the failure to parallel in-process Clang compilation/loading, not
to cross-TU resource or uninitialized summary construction.
Workaround
Set:
{
  "stack_analyzer": {
    "jobs": "1"
  }
}
This serializes module loading/compilation and avoids the stack overflow in the
observed environment.
Proposed fix direction
Do not treat the jobs value as automatically safe for in-process Clang
compilation. The current architecture is fast, but a crash in the Clang frontend
has the same blast radius as a crash in the analyzer itself: it terminates the
hosting process.
Recommended direction:
- Introduce a dedicated module loading / compile execution policy.
- Serialize source-to-IR compilation when using in-process compilerlib.
- Keep parallelism for analysis phases that operate on already-loaded modules.
- Prefer subprocess isolation for Clang compilation as the robust long-term
path. If a subprocess crashes, the analyzer can report a failed TU instead of
crashing the parent process.
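To illustrate the subprocess option, here is a minimal POSIX sketch of how a parent process can tell a crashed child apart from one that exited normally. ChildResult and runIsolated are hypothetical names, not existing analyzer or compilerlib APIs, and the actual command line used to compile one translation unit is left to the caller because this report does not specify it. The sketch uses posix_spawnp rather than fork so that it stays simple to call from a multi-threaded parent.

```cpp
#include <cerrno>
#include <string>
#include <vector>

#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern "C" char** environ;  // POSIX: environment passed through to the child

// Hypothetical result type for illustration; not part of the analyzer's API.
struct ChildResult {
    bool launched = false;  // the child process was started and reaped
    bool crashed = false;   // the child was terminated by a signal
    int exitCode = -1;      // meaningful when launched && !crashed
    int signalNumber = 0;   // meaningful when crashed
};

// Hypothetical helper: runs `args` (program name first) as a child process
// and waits for it. A crash in the child is reported to the caller instead
// of taking down the calling process.
ChildResult runIsolated(const std::vector<std::string>& args) {
    ChildResult result;
    if (args.empty()) return result;

    // posix_spawnp expects a null-terminated array of mutable C strings.
    std::vector<char*> argv;
    for (const std::string& arg : args)
        argv.push_back(const_cast<char*>(arg.c_str()));
    argv.push_back(nullptr);

    pid_t pid = 0;
    // posix_spawnp returns an error number directly; it does not set errno.
    if (posix_spawnp(&pid, argv[0], nullptr, nullptr, argv.data(), environ) != 0)
        return result;

    int status = 0;
    pid_t waited;
    do {
        waited = waitpid(pid, &status, 0);
    } while (waited < 0 && errno == EINTR);  // retry if interrupted
    if (waited < 0) return result;

    result.launched = true;
    if (WIFSIGNALED(status)) {
        result.crashed = true;
        result.signalNumber = WTERMSIG(status);
    } else if (WIFEXITED(status)) {
        result.exitCode = WEXITSTATUS(status);
    }
    return result;
}
```

With a helper of this shape, a worker could record a per-TU failure when crashed is true (for example, a SIGBUS or SIGSEGV from a frontend stack overflow) and continue with the remaining translation units. The cost is process startup per TU and the need to pass the resulting IR back to the analyzer, for example through a file.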
Increasing the worker thread stack size via platform-specific thread attributes
can make this specific crash less likely, but it is less robust than isolating
Clang compilation or serializing the in-process frontend. The issue is generic: any
template-heavy translation unit can exceed a worker stack when compiled
in-process.
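For reference, the stack-size mitigation could look roughly like the sketch below, which uses the POSIX pthread_attr_setstacksize call. runWithStackSize is a hypothetical helper, and this report does not state how runParallelWork creates its threads, so how this would be wired into the analyzer is an assumption. If the workers are created with default attributes, note that Apple's threading documentation lists 512 KB as the default stack size for secondary threads on macOS, far below the 8176 KB shell limit that applies to the main thread.

```cpp
#include <cstddef>
#include <functional>

#include <pthread.h>

namespace {

// pthread entry points take and return void*; this trampoline forwards to the
// std::function owned by the caller of runWithStackSize.
void* runTask(void* raw) {
    (*static_cast<std::function<void()>*>(raw))();
    return nullptr;
}

}  // namespace

// Hypothetical helper: runs `work` on a new thread whose stack is `stackBytes`
// large, then joins it. Returns false if the attributes were rejected or the
// thread could not be created or joined. `work` must not throw, because an
// exception escaping a thread entry point terminates the process.
bool runWithStackSize(std::size_t stackBytes, std::function<void()> work) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return false;

    // Can fail with EINVAL if the size is below PTHREAD_STACK_MIN or, on some
    // platforms, not a multiple of the page size.
    if (pthread_attr_setstacksize(&attr, stackBytes) != 0) {
        pthread_attr_destroy(&attr);
        return false;
    }

    pthread_t thread;
    // `work` lives in this frame and outlives the thread because we join below.
    const int rc = pthread_create(&thread, &attr, runTask, &work);
    pthread_attr_destroy(&attr);
    if (rc != 0) return false;

    return pthread_join(thread, nullptr) == 0;
}
```

A call such as runWithStackSize(8u * 1024 * 1024, [&] { loadSingleModule(index); }) would give each load the same headroom as the main thread. Any fixed size is still a guess at how deep template instantiation can go, which is why this remains a mitigation rather than a fix.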