fix(replace): handle one new row conflicting with multiple old rows via different unique keys #24581
…ia different unique keys

REPLACE INTO raised a duplicate-key error when a single new row conflicted with several existing rows through different unique keys (issue matrixorigin#24428). The expected MySQL semantics are: delete all conflicting old rows, then insert the new row once.

Root causes and fixes:

1. hashbuild (real PK + multi-UK): the OR'd LEFT JOIN fans one new row into several build rows carrying different old PKs. keep-last keeps one and turns the rest into delete-only rows, but their old-PK column was nulled and the delColIdx delete-marker pass only scanned surviving rows, so the surviving bucket was not marked deleted. When the new PK also matched an existing row, the dedup-join probe then raised a false DuplicateEntry. Preserve the old-PK column on delete-only rows and scan them in the delColIdx pass.
2. bind_replace (fake PK): fake-PK tables built the conflict-detection LEFT JOIN from only the first unique key, missing conflicts on the others. OR one condition per unique key, matching the real-PK path.

Adds a hashbuild unit test for the discarded-fan-out delete marking and BVT coverage for multi-UK / explicit-PK+multi-UK / fake-PK / composite-UK fan-out.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aunjgr
left a comment
No blocking issues found.
This follow-up completes the REPLACE multi-unique-key fan-out path: it preserves the old-PK value on delete-only rows, scans the full build batch when marking delete buckets, and adds unit/BVT coverage for the previously failing fan-out cases.
XuPeng-SH
left a comment
No blocking issues found.
I checked the multi-UK fan-out fix from a few angles: the fake-PK LEFT JOIN now matches conflicts on any unique key, the hashbuild keep-last path now preserves/scans the old-PK values needed to mark surviving conflict buckets as deleted, and the new unit/BVT coverage matches the previously failing REPLACE cases. The change looks consistent with the existing delete-only / DelRows / dedup-join flow.
Merge Queue Status
This pull request spent 47 minutes 24 seconds in the queue, with no time running CI.
Reason: The merge conditions cannot be satisfied due to failing checks.
Hint: You may have to fix your CI before adding the pull request to the queue again.
Merge Queue Status
This pull request spent 1 hour 6 minutes 23 seconds in the queue, including 1 hour 5 minutes 32 seconds running CI.
What type of PR is this?
Which issue(s) this PR fixes:
issue #24428
What this PR does / why we need it:
REPLACE INTO produced a duplicate-key error when a single new row conflicted with multiple existing rows via different unique keys. The expected MySQL semantics are: delete all conflicting old rows, then insert the new row exactly once.
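The expected semantics can be illustrated with a minimal in-memory sketch. This is not MatrixOne code: the `Row` type, its two unique columns `A` and `B`, and the `replaceInto` helper are hypothetical names chosen for illustration, and NULL handling is ignored.

```go
package main

import "fmt"

// Row is a hypothetical table row: a primary key plus two unique columns.
type Row struct {
	PK, A, B int
}

// replaceInto models the expected REPLACE semantics on an in-memory table:
// drop every old row that conflicts with newRow on PK, A, or B,
// then append newRow exactly once.
func replaceInto(table []Row, newRow Row) (result []Row, deleted int) {
	for _, old := range table {
		if old.PK == newRow.PK || old.A == newRow.A || old.B == newRow.B {
			deleted++ // conflicting old row is removed
			continue
		}
		result = append(result, old)
	}
	return append(result, newRow), deleted
}

func main() {
	table := []Row{{1, 10, 100}, {2, 20, 200}, {3, 30, 300}}
	// The new row conflicts with row 1 via A and with row 2 via B.
	table, deleted := replaceInto(table, Row{PK: 4, A: 10, B: 200})
	fmt.Println(deleted, table) // prints: 2 [{3 30 300} {4 10 200}]
}
```

The point of the sketch is that one new row may remove several old rows, yet it is inserted only once.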
This fixes two cases:
1. Real PK + multiple unique keys (pkg/sql/colexec/hashbuild/hashmap.go)
   The OR'd LEFT JOIN fans one new row into several build rows that carry different old PKs. keep-last keeps one of them and converts the rest into delete-only rows, but their old-PK column was nulled and the delColIdx delete-marker pass only scanned the surviving rows. So the surviving bucket was not marked deleted, and when the new PK also matched an existing row the dedup-join probe raised a false Duplicate entry. Fix: keep the old-PK column on the delete-only rows and let the delColIdx pass scan them, so the surviving bucket is correctly marked deleted.
2. Fake PK + multiple unique keys (pkg/sql/plan/bind_replace.go)
   Fake-PK tables built the conflict-detection LEFT JOIN from only the first unique key, missing conflicts on the others. Fix: OR one (AND-of-parts) condition per unique key, matching the real-PK path.
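The shape of the "OR one (AND-of-parts) condition per unique key" predicate can be sketched as follows. This is only an illustration: the actual planner builds expression trees rather than strings, and `conflictCondition`, the aliases `n`/`o`, and the column names are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// conflictCondition renders a join predicate with one AND-of-parts group
// per unique key, and the groups combined with OR.
func conflictCondition(newAlias, oldAlias string, uniqueKeys [][]string) string {
	groups := make([]string, 0, len(uniqueKeys))
	for _, cols := range uniqueKeys {
		parts := make([]string, 0, len(cols))
		for _, c := range cols {
			// Each part of a composite key must match.
			parts = append(parts, fmt.Sprintf("%s.%s = %s.%s", newAlias, c, oldAlias, c))
		}
		groups = append(groups, "("+strings.Join(parts, " AND ")+")")
	}
	// A conflict on any one unique key is enough.
	return strings.Join(groups, " OR ")
}

func main() {
	// One single-column unique key (a) and one composite unique key (b, c).
	fmt.Println(conflictCondition("n", "o", [][]string{{"a"}, {"b", "c"}}))
	// prints: (n.a = o.a) OR (n.b = o.b AND n.c = o.c)
}
```

Building the predicate from only the first element of `uniqueKeys` would reproduce the fake-PK bug described above: conflicts on the remaining keys would never be detected.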
The single-UK fix (#24425) and the duplicate-source-key keep-last behavior (#24497) are
preserved.
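The interaction between keep-last and delete marking can be modeled in a few lines. This is a heavily simplified sketch, not the hashbuild implementation: `BuildRow`, `keepLast`, and `ptr` are hypothetical names, and the real code works on batches and hash buckets rather than slices and maps.

```go
package main

import "fmt"

// BuildRow is a hypothetical build-side row produced by the LEFT JOIN fan-out.
type BuildRow struct {
	NewPK int  // primary key of the new row
	OldPK *int // PK of the conflicting old row; nil if there was no conflict
}

func ptr(v int) *int { return &v }

// keepLast keeps the last build row per NewPK and treats earlier ones as
// delete-only. Old PKs are collected from every row, including the
// discarded ones, so no conflicting old row is missed.
func keepLast(rows []BuildRow) (survivors []int, toDelete map[int]bool) {
	last := make(map[int]int)
	for i, r := range rows {
		last[r.NewPK] = i
	}
	toDelete = make(map[int]bool)
	for i, r := range rows {
		if last[r.NewPK] == i {
			survivors = append(survivors, i)
		}
		if r.OldPK != nil {
			toDelete[*r.OldPK] = true
		}
	}
	return survivors, toDelete
}

func main() {
	// New row with PK 1 conflicts with old rows 1, 2, and 3 via different keys.
	rows := []BuildRow{{1, ptr(1)}, {1, ptr(2)}, {1, ptr(3)}}
	survivors, toDelete := keepLast(rows)
	fmt.Println(survivors, len(toDelete)) // prints: [2] 3
}
```

If the old PKs of the discarded rows were nulled, or only the survivor were scanned, `toDelete` would contain just old row 3, and the new PK 1 would still collide with old row 1.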
How verified
TestDedupBuildKeepLastMarksConflictBucketForDiscardedFanout (fails without the fix, passes with it).
test/distributed/cases/dml/replace/replace.{test,result} covering multi-UK / explicit-PK + multi-UK / fake-PK / composite-UK / multi-row fan-out.
multi_update unit tests pass;
make static-check is clean.

🤖 Generated with Claude Code