False Positive: Invalid Base Reconstruction on Range-For Pointer-Slot Induction #76

@SizzleUnrlsd

Problem

InvalidBaseReconstruction reports a false positive on normal C++ range-for
iteration over a local fixed-size array, for example:

constexpr std::string_view cExtensions[] = {".c", ".h"};

for (const auto& ext : cExtensions)
{
    if (filename.ends_with(ext))
        return LanguageType::C;
}

The diagnostic reports cExtensions as an invalid base reconstruction and prints
a huge list of offsets:

source member: offsets base, +16, +32, +48, ...
offset applied: +16 bytes
derived pointer points OUTSIDE the valid object range

Root Cause

The issue is in collectPointerOrigins.

For compiler-generated range-for IR, the loop iterator is stored in a pointer
slot:

%cur = load ptr, ptr %slot
%next = getelementptr ..., ptr %cur, 1
store ptr %next, ptr %slot

collectPointerOrigins then follows this chain:

load slot -> pointer alloca -> stores -> gep(load slot, +16) -> load slot -> ...

Because visited is keyed by (Value, offset), every trip around this cycle adds
another +16 and is treated as a new state. The traversal therefore never
converges and keeps generating offsets until the work budget is exhausted.

This is not a real offsetof/container_of reconstruction. It is a bounded
iterator induction.
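
The non-convergence can be reproduced with a toy model that does not depend on LLVM. The sketch below is an illustration only: Node, exploreOrigins, and the three-node cycle are hypothetical stand-ins, not the analyzer's real data structures. It assumes, as described above, that the visited set is keyed by (value, offset) and that traversal stops only when a work budget is reached.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Toy stand-ins for the three IR values in the cycle (not LLVM types).
enum class Node { LoadSlot, Slot, GepNext };

// Walks load slot -> slot -> gep(load slot, step) -> load slot -> ...
// with a visited set keyed by (node, offset). Returns how many distinct
// states were explored before the budget stopped the walk.
std::size_t exploreOrigins(std::int64_t step, std::size_t budget)
{
    std::set<std::pair<Node, std::int64_t>> visited;
    std::vector<std::pair<Node, std::int64_t>> work;
    work.emplace_back(Node::LoadSlot, std::int64_t{0});

    while (!work.empty() && visited.size() < budget)
    {
        auto [node, off] = work.back();
        work.pop_back();
        if (!visited.insert({node, off}).second)
            continue; // already seen this exact (node, offset) pair

        switch (node)
        {
        case Node::LoadSlot: // a load is explained by the slot it reads
            work.emplace_back(Node::Slot, off);
            break;
        case Node::Slot: // a slot is explained by the value stored into it
            work.emplace_back(Node::GepNext, off);
            break;
        case Node::GepNext: // the stored value is gep(load slot, step)
            work.emplace_back(Node::LoadSlot, off + step);
            break;
        }
    }
    return visited.size();
}
```

In this model, exploreOrigins(16, 300) explores exactly 300 states: the offset grows on every lap, so no state is ever revisited and only the budget ends the walk.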

Expected Behavior

Normal range-for iteration over a local array must not produce an
InvalidBaseReconstruction warning or error.

Existing invalid container_of and wrong-offset cases must still be detected.

Proposed Fix

Detect self-referential pointer-slot induction stores:

store gep(load same_pointer_slot, constant_step), same_pointer_slot

and avoid treating that store as a new free pointer origin.
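
A minimal sketch of the pattern match, written against a hypothetical mini-IR so that it compiles on its own. Value, Store, and isSelfInductionStore are illustrative names, not existing analyzer code. In the real pass, the same checks would presumably be made on LLVM's StoreInst, GetElementPtrInst, and LoadInst, with the constant step computed through the DataLayout.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical mini-IR, only detailed enough to express the pattern.
struct Value
{
    enum Kind { Alloca, Load, Gep, Other } kind = Other;
    const Value* operand = nullptr;          // Load: address read; Gep: base pointer
    std::optional<std::int64_t> constOffset; // Gep: byte offset if all indices are constant
};

struct Store
{
    const Value* value;   // value being stored
    const Value* address; // destination pointer
};

// Returns true for: store gep(load slot, constant_step), slot
bool isSelfInductionStore(const Store& st)
{
    const Value* slot = st.address;
    if (!slot || slot->kind != Value::Alloca)
        return false; // destination is not a local pointer slot

    const Value* gep = st.value;
    if (!gep || gep->kind != Value::Gep || !gep->constOffset)
        return false; // stored value is not a constant-step GEP

    const Value* load = gep->operand;
    // The GEP base must be a load from the very slot being written.
    return load && load->kind == Value::Load && load->operand == slot;
}
```

A store that matches would be skipped when collecting origins for the slot, while the initial store of the array base into the slot is still followed. The match is purely structural, so it does not depend on element types or variable names.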

Also compact large offset diagnostics as a defensive measure, while preserving
existing short diagnostic formats.
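
One possible shape for the compaction, as a sketch only. formatOffsets, the default limit of 8, and the "... (N more)" suffix are assumptions made for illustration; the issue does not specify the compact format. Lists at or under the limit are printed in the same "base, +16, +32" form shown in the diagnostic above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Formats byte offsets as "base, +16, +32". Lists longer than maxShown
// are truncated and summarized with a count of the omitted entries.
std::string formatOffsets(const std::vector<std::int64_t>& offsets,
                          std::size_t maxShown = 8)
{
    const std::size_t shown = offsets.size() < maxShown ? offsets.size() : maxShown;
    std::string out;
    for (std::size_t i = 0; i < shown; ++i)
    {
        if (i != 0)
            out += ", ";
        if (offsets[i] == 0)
        {
            out += "base";
        }
        else
        {
            if (offsets[i] > 0)
                out += "+"; // negative values already carry their sign
            out += std::to_string(offsets[i]);
        }
    }
    if (offsets.size() > shown)
        out += ", ... (" + std::to_string(offsets.size() - shown) + " more)";
    return out;
}
```

With this sketch, formatOffsets({0, 16, 32}) yields "base, +16, +32", and formatOffsets({0, 16, 32, 48}, 2) yields "base, +16, ... (2 more)".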

Acceptance Criteria

  • No InvalidBaseReconstruction diagnostic for range-for over a local array.
  • Existing offset_of-container_of positive tests still pass.
  • Add a regression fixture similar to the original false positive.
  • No hardcoded filtering on std::string_view, variable names, or source file names.
  • Implementation remains generic over LLVM GEP and DataLayout behavior.
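
A regression fixture along these lines might look like the following sketch. The function name detectLanguage, the LanguageType enum, and the endsWith helper are assumptions: the helper replaces the original filename.ends_with(ext) call only so that the fixture also builds as C++17. Any expected-diagnostic annotations required by run_test are not shown, because their format is not given in this issue.

```cpp
#include <string_view>

enum class LanguageType { Unknown, C };

// C++17-compatible stand-in for std::string_view::ends_with.
static bool endsWith(std::string_view s, std::string_view suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Range-for over a local fixed-size array: the analyzer is expected to
// report no InvalidBaseReconstruction diagnostic for this function.
LanguageType detectLanguage(std::string_view filename)
{
    constexpr std::string_view cExtensions[] = {".c", ".h"};

    for (const auto& ext : cExtensions)
    {
        if (endsWith(filename, ext))
            return LanguageType::C;
    }
    return LanguageType::Unknown;
}
```

The fixture deliberately keeps the same array-of-string_view shape as the original report, but the fix itself should not depend on that shape.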

Test Plan

Run:

cmake --build build
python3 -c 'from pathlib import Path; import run_test; run_test.RUN_CONFIG.analyzer = Path("build/stack_usage_analyzer"); ok = True; total = 0; passed = 0
for path in sorted(Path("test/offset_of-container_of").glob("*.c")) + sorted(Path("test/offset_of-container_of").glob("*.cpp")):
    result, t, p, report = run_test.check_file(path)
    ok = ok and result
    total += t
    passed += p
print(f"offset_of-container_of targeted result: {passed}/{total}")
raise SystemExit(0 if ok else 1)'

Suggested Commit Messages

fix(analysis): bound pointer-slot induction in origin tracking
test(analysis): cover pointer-slot induction false positive
