fix(dataset rm): delete staging files from a uid-65532 pod, not jobs-manager (#259) #78
Open
LukasWodka wants to merge 1 commit into
…manager (#259)

`tracebloc dataset rm` dropped the table but failed to delete the dataset's staging files on the shared PVC:

    rm: cannot remove '/data/shared/.tracebloc-staging/<t>/labels.csv': Permission denied

Root cause: the staging files are written by the CLI's ephemeral stage pod as uid 65532 (+ fsGroup 65532), but the teardown exec'd `rm` inside the long-lived jobs-manager pod, which runs as a different non-root uid with no shared fsGroup. A non-65532 uid cannot delete 65532-owned files in a non-group-writable dir, so the rm hit EACCES and left orphans. The "re-run to clean up" advice was a dead end: the same permission error every time.

Fix: run the teardown `rm` from a short-lived pod that mirrors the stage pod's identity (uid 65532 + fsGroup 65532, shared PVC mounted), reusing the existing BuildStagePodSpec / CreateStagePod / WaitForStagePodReady / DeleteStagePod machinery. That pod OWNS the staging files it deletes, so it works by ownership on hostPath (where fsGroup is a no-op, kubernetes/kubernetes#138411) and CSI alike. Fully fixes tabular datasets (no sidecar files) on every volume type.

Also:
- Teardown now takes an injectable Executor (matching push.Stage), enabling a regression test that pins "rm runs in a uid-65532 stage pod, not jobs-manager".
- dataset_rm: drop the misleading "re-run completes the cleanup" claim; the table DROP is idempotent, and if file removal keeps failing, point to node-side cleanup.

Refs #259. The image/sidecar case (ingestor's /data/shared/<table> written as uid 65534) on hostPath still needs the documented complement (ingestor fsGroup + group-writable DEST_PATH in client-runtime/data-ingestors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
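The permission failure described in the root cause can be illustrated with a small model of the classic POSIX owner/group/other check. This is a minimal sketch, not code from the PR: `dirInfo`, `proc`, and `canUnlinkIn` are hypothetical names, the directory mode `0755` and the jobs-manager uid `1000` are assumed values (the commit message only says the directory is not group-writable and that jobs-manager runs as a different non-root uid), and the model ignores root, capabilities, ACLs, and the sticky bit.

```go
package main

import (
	"fmt"
	"os"
)

// dirInfo is a hypothetical, simplified view of a directory's ownership and mode.
type dirInfo struct {
	UID  int
	GID  int
	Mode os.FileMode
}

// proc is a hypothetical view of a process identity (uid plus group memberships).
type proc struct {
	UID    int
	Groups []int
}

// inGroup reports whether gid appears in groups.
func inGroup(groups []int, gid int) bool {
	for _, g := range groups {
		if g == gid {
			return true
		}
	}
	return false
}

// canUnlinkIn reports whether p may remove an entry from directory d.
// Removing an entry needs write and search (execute) permission on the
// containing directory; exactly one permission class applies to a process.
// Simplifications: no root, capabilities, ACLs, or sticky bit.
func canUnlinkIn(d dirInfo, p proc) bool {
	const wx = 0o3 // write + execute bits within one permission class
	perm := d.Mode.Perm()
	switch {
	case p.UID == d.UID:
		return (perm>>6)&wx == wx
	case inGroup(p.Groups, d.GID):
		return (perm>>3)&wx == wx
	default:
		return perm&wx == wx
	}
}

func main() {
	// Assumed values: staging directory owned by 65532:65532 with mode 0755.
	staging := dirInfo{UID: 65532, GID: 65532, Mode: 0o755}
	stagePod := proc{UID: 65532, Groups: []int{65532}}
	jobsManager := proc{UID: 1000, Groups: []int{1000}} // uid 1000 is hypothetical

	fmt.Println("stage-identity pod can remove:", canUnlinkIn(staging, stagePod))
	fmt.Println("jobs-manager can remove:", canUnlinkIn(staging, jobsManager))
}
```

Under these assumptions the stage-identity process passes the owner check, while the jobs-manager process falls into the "other" class and lacks write permission on the directory, which matches the `Permission denied` in the report.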
👋 Heads-up: Code review queue is at 16 / 8, above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
aptracebloc
approved these changes
Jun 15, 2026
Summary
Fixes the `tracebloc dataset rm` teardown failure in tracebloc/client#259: the table dropped but the staging files leaked with `Permission denied`, and the "re-run" advice never worked.

Root cause (refined from the issue)
The leftover files are written by the CLI's ephemeral stage pod (uid 65532 + `FSGroup 65532`), not the ingestor. The teardown then exec'd `rm -rf` inside the long-lived jobs-manager pod (a different non-root uid, no shared `fsGroup`). A non-65532 uid can't delete 65532-owned files in a non-group-writable directory → `EACCES`, orphans left behind. Re-running hit the same wall every time.

(Full three-UID analysis + why `fsGroup`-on-jobs-manager is a no-op on hostPath is in the issue #259 refinement comment.)

Fix
Run the teardown `rm` from a short-lived pod that mirrors the stage pod's identity (uid 65532 + `FSGroup 65532`, shared PVC mounted), reusing the existing `BuildStagePodSpec` / `CreateStagePod` / `WaitForStagePodReady` / `DeleteStagePod` machinery (same pattern as `push.Stage`).

The teardown pod owns the staging files it deletes, so it works by ownership on `hostPath` (where `fsGroup` is a no-op, kubernetes/kubernetes#138411) and CSI alike. This fully fixes tabular datasets (no sidecar files) on every volume type, which is the common case and the current blocker.

Secondary:
- `Teardown` now takes an injectable `Executor` (matching `push.Stage`), so the exec path is unit-testable.
- `dataset_rm` drops the misleading "re-run completes the cleanup" claim: the table `DROP` is idempotent, and if file removal keeps failing the message points to node-side cleanup.

Scope / what's left
`Refs #259` (not auto-closing). The image/sidecar case (the ingestor's `/data/shared/<table>` files written as uid 65534) is deletable by this fix on CSI (via `fsGroup`) but not on hostPath. Closing that needs the documented complement (ingestor `fsGroup 65532` + group-writable `DEST_PATH` in `client-runtime/data-ingestors`). Tabular datasets are fully covered now.

Test plan
- `go build ./...`, `go vet`, `gofmt`: clean.
- `go test ./...`: all packages pass.
- `TestTeardown_RemovesViaStageIdentityPod` pins the fix: the `rm` runs in a `tracebloc-stage-*` pod with `RunAsUser`/`FSGroup = 65532` (container `stage`), not the jobs-manager pod, with the correct `rm -rf <paths>`, and the teardown pod is cleaned up afterward.

🤖 Generated with Claude Code
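For readers unfamiliar with the injectable-executor pattern, the kind of check the regression test performs can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's test: the `Executor` interface shape, `fakeExecutor`, `teardown`, `checkCalls`, the pod name suffix `abc123`, and the path segment `demo` are all hypothetical. It covers only the "which pod and which command" assertions, not the `RunAsUser`/`FSGroup` checks or pod cleanup.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Executor is a hypothetical stand-in for the injectable exec interface;
// the real signature in the CLI may differ.
type Executor interface {
	Exec(pod, container string, cmd []string) error
}

// execCall records one Exec invocation.
type execCall struct {
	Pod       string
	Container string
	Cmd       []string
}

// fakeExecutor records calls instead of talking to a cluster.
type fakeExecutor struct {
	calls []execCall
}

func (f *fakeExecutor) Exec(pod, container string, cmd []string) error {
	f.calls = append(f.calls, execCall{
		Pod:       pod,
		Container: container,
		Cmd:       append([]string(nil), cmd...), // copy so later mutation cannot alter the record
	})
	return nil
}

// teardown is a simplified, hypothetical model of the fixed flow: it runs
// rm -rf for the staging paths inside the given stage-identity pod.
func teardown(ex Executor, stagePod string, paths []string) error {
	if len(paths) == 0 {
		return errors.New("no staging paths to remove")
	}
	cmd := append([]string{"rm", "-rf"}, paths...)
	return ex.Exec(stagePod, "stage", cmd)
}

// checkCalls verifies that rm ran exactly once, in a tracebloc-stage-* pod,
// in the "stage" container, as "rm -rf <paths>".
func checkCalls(calls []execCall) error {
	if len(calls) != 1 {
		return fmt.Errorf("want 1 exec call, got %d", len(calls))
	}
	c := calls[0]
	if !strings.HasPrefix(c.Pod, "tracebloc-stage-") {
		return fmt.Errorf("rm ran in pod %q, want a tracebloc-stage-* pod", c.Pod)
	}
	if c.Container != "stage" {
		return fmt.Errorf("rm ran in container %q, want \"stage\"", c.Container)
	}
	if len(c.Cmd) < 3 || c.Cmd[0] != "rm" || c.Cmd[1] != "-rf" {
		return fmt.Errorf("unexpected command %v", c.Cmd)
	}
	return nil
}

func main() {
	fake := &fakeExecutor{}
	paths := []string{"/data/shared/.tracebloc-staging/demo"} // "demo" is a made-up table name
	if err := teardown(fake, "tracebloc-stage-abc123", paths); err != nil {
		fmt.Println("teardown failed:", err)
		return
	}
	if err := checkCalls(fake.calls); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok:", strings.Join(fake.calls[0].Cmd, " "))
}
```

Because the executor is passed in, the same assertions would fail if the teardown targeted the jobs-manager pod, which is the regression the PR's test is meant to catch.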