[refactor](storage) drop StorageField wrapper and clean up related dead code#63233
Open
csun5285 wants to merge 2 commits into
Open
[refactor](storage) drop StorageField wrapper and clean up related dead code#63233csun5285 wants to merge 2 commits into
csun5285 wants to merge 2 commits into
Conversation
StorageField was a thin wrapper over TabletColumn that cached a KeyCoder* pointer and pre-resolved an owned tree of sub-fields. Aside from those two extras, every accessor (type/length/name/is_nullable/unique_id/...) was a direct forward to the underlying TabletColumn it held a copy of, and all 11 subclasses (CharField/VarcharField/.../HllAggField) were empty stubs with zero callers distinguishing them (no dynamic_cast/typeid/static_cast). Replace StorageField with TabletColumn throughout the storage layer: - Schema now stores vector<TabletColumnPtr> instead of vector<StorageField*>, so copy/dtor are handled by shared_ptr ref counting; the deep-copy clone() path is gone. - ColumnWriter family takes TabletColumnPtr (owned by writer) instead of unique_ptr<StorageField>; get_field()/_field renamed to get_column()/_column. - IndexColumnWriter::create and ZoneMapIndexWriter::create take const TabletColumn* directly. - DataTypeFactory drops the StorageField overload (kept the existing TabletColumn one). - Five row_cursor.cpp encode sites switch to free helpers (get_key_coder(type)->...), the only consumers of StorageField's _key_coder cache. Per-call switch overhead is negligible since this is not a hot path; production hot paths (vertical/segment_writer, indexed_column_*) already cache KeyCoder locally without going through StorageField. - _has_char_type, _init_column_mapping, index_builder, variant_*, and segment_iterator switch to TabletColumn::get_sub_column/get_subtype_count in place of StorageField::get_sub_field/get_sub_field_count. - Rename row_cursor's _encode_field to _encode_column_value and column_schema(cid) to column(cid) to match the new semantics; rename local null_field/bigint_field to null_column_ptr/length_column_ptr. Net effect: -434 lines (field.h removed plus dead 11-subclass hierarchy), no behavioral change. BE UT (302 tests across tablet_schema, storage_types, KeyCoder, InvertedIndex, BkdIndex, ColumnWriter, ZoneMap, RowCursor) all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
run buildall |
Contributor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
…rs + minor cleanup
Follow-up to "drop StorageField wrapper". Three small refactors:
1. ColumnWriter::cell_size() helper
Replace 7 occurrences of `field_type_size(get_column()->type())` in
ColumnWriter::append_nullable / ScalarColumnWriter::append_* with a single
inline cell_size() method on the base class. Pure DRY, no behavior change.
2. VariantWriter ctors: drop redundant raw `column` param
VariantDocCompactWriter / VariantSubcolumnWriter / VariantColumnWriter
previously took both `(const TabletColumn* column, TabletColumnPtr owned_column)`
where the call site always passed the same column in both positions:
std::make_unique<VariantDocCompactWriter>(
opts, column, std::make_shared<TabletColumn>(*column));
The raw `column` was stored as `_tablet_column` for direct getter access, but
the same pointer is available via the base class's `get_column()` (returning
`_column.get()`). Collapse to a single `TabletColumnPtr column` param and drop
the redundant `_tablet_column` member from both ColumnWriter subclasses; rewrite
their 6 method-body usages to call `get_column()`.
VariantColumnWriterImpl is the pimpl impl of VariantColumnWriter and is NOT a
ColumnWriter subclass, so it keeps its own `_tablet_column` member; the outer
VariantColumnWriter ctor now passes `get_column()` to the impl ctor.
Semantic change: the raw pointer formerly pointed to the caller's original
TabletColumn; now it points to the base class's owned copy (made via
`std::make_shared<TabletColumn>(*column)` at the call site). Lifetime-safe and
content-equivalent for read-only access.
3. olap_common.h: update stale comments
- FieldType enum: drop the dangling "Field" reference (StorageField was
removed by the parent commit, doris::Field is unrelated here).
- FieldAggregationMethod enum: restore the previous verbose comment style
with the class name updated to TabletColumn.
Contributor
TPC-H: Total hot run time: 29579 ms |
Contributor
TPC-DS: Total hot run time: 172234 ms |
Contributor
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
Contributor
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
Contributor
Author
|
run buildall |
Contributor
TPC-H: Total hot run time: 29501 ms |
Contributor
TPC-DS: Total hot run time: 171293 ms |
Contributor
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
Contributor
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Drop the
StorageFieldwrapper and related dead code.StorageFieldwas a thin layer overTabletColumn— every accessor just forwarded, all 11 subclasses were empty stubs with no caller distinguishing them viadynamic_cast/typeid. After removing it, several dead pieces fell out.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)