Skip to content

fix(dataset rm): delete staging files from a uid-65532 pod, not jobs-manager (#259)#78

Open
LukasWodka wants to merge 1 commit into
developfrom
fix/259-dataset-rm-teardown-uid
Open

fix(dataset rm): delete staging files from a uid-65532 pod, not jobs-manager (#259)#78
LukasWodka wants to merge 1 commit into
developfrom
fix/259-dataset-rm-teardown-uid

Conversation

@LukasWodka

Copy link
Copy Markdown
Contributor

Summary

Fixes the tracebloc dataset rm teardown failure in tracebloc/client#259: the table dropped but the staging files leaked with Permission denied, and the "re-run" advice never worked.

Root cause (refined from the issue)

The leftover files are written by the CLI's ephemeral stage pod (uid 65532 + FSGroup 65532), not the ingestor. The teardown then exec'd rm -rf inside the long-lived jobs-manager pod (a different non-root uid, no shared fsGroup). A non-65532 uid can't delete 65532-owned files in a non-group-writable directory → EACCES, orphans left behind. Re-running hit the same wall every time.

(Full three-UID analysis + why fsGroup-on-jobs-manager is a no-op on hostPath is in the issue #259 refinement comment.)

Fix

Run the teardown rm from a short-lived pod that mirrors the stage pod's identity (uid 65532 + FSGroup 65532, shared PVC mounted), reusing the existing BuildStagePodSpec / CreateStagePod / WaitForStagePodReady / DeleteStagePod machinery (same pattern as push.Stage).

The teardown pod owns the staging files it deletes, so it works by ownership on hostPath (where fsGroup is a no-op, kubernetes/kubernetes#138411) and CSI alike. This fully fixes tabular datasets (no sidecar files) on every volume type — the common case and the current blocker.

Secondary:

  • Teardown now takes an injectable Executor (matching push.Stage), so the exec path is unit-testable.
  • dataset_rm drops the misleading "re-run completes the cleanup" claim — the table DROP is idempotent, and if file removal keeps failing the message points to node-side cleanup.

Scope / what's left

Refs #259 (not auto-closing). The image/sidecar case — the ingestor's /data/shared/<table> files written as uid 65534 — is deletable by this fix on CSI (via fsGroup) but not on hostPath. Closing that needs the documented complement (ingestor fsGroup 65532 + group-writable DEST_PATH in client-runtime / data-ingestors). Tabular datasets are fully covered now.

Test plan

  • go build ./..., go vet, gofmt — clean.
  • go test ./... — all packages pass.
  • New regression test TestTeardown_RemovesViaStageIdentityPod pins the fix: the rm runs in a tracebloc-stage-* pod with RunAsUser/FSGroup = 65532 (container stage), not the jobs-manager pod, with the correct rm -rf <paths>, and the teardown pod is cleaned up afterward.

🤖 Generated with Claude Code

…manager (#259)

`tracebloc dataset rm` dropped the table but failed to delete the dataset's
staging files on the shared PVC:

  rm: cannot remove '/data/shared/.tracebloc-staging/<t>/labels.csv': Permission denied

Root cause: the staging files are written by the CLI's ephemeral stage pod as
uid 65532 (+ fsGroup 65532), but the teardown exec'd `rm` inside the long-lived
jobs-manager pod, which runs as a different non-root uid with no shared fsGroup.
A non-65532 uid cannot delete 65532-owned files in a non-group-writable dir, so
the rm hit EACCES and left orphans. The "re-run to clean up" advice was a dead
end — the same permission error every time.

Fix: run the teardown `rm` from a short-lived pod that mirrors the stage pod's
identity (uid 65532 + fsGroup 65532, shared PVC mounted), reusing the existing
BuildStagePodSpec / CreateStagePod / WaitForStagePodReady / DeleteStagePod
machinery. That pod OWNS the staging files it deletes, so it works by ownership
on hostPath (where fsGroup is a no-op, kubernetes/kubernetes#138411) and CSI
alike. Fully fixes tabular datasets (no sidecar files) on every volume type.

Also:
- Teardown now takes an injectable Executor (matching push.Stage), enabling a
  regression test that pins "rm runs in a uid-65532 stage pod, not jobs-manager".
- dataset_rm: drop the misleading "re-run completes the cleanup" claim; the table
  DROP is idempotent, and if file removal keeps failing, point to node-side cleanup.

Refs #259. The image/sidecar case (ingestor's /data/shared/<table> written as
uid 65534) on hostPath still needs the documented complement (ingestor fsGroup +
group-writable DEST_PATH in client-runtime/data-ingestors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LukasWodka

Copy link
Copy Markdown
Contributor Author

👋 Heads-up — Code review queue is at 16 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants