Skip to content

RFC: Query Planner for Per-Filter-Bitmap Disk Search#1102

Open
YuanyuanTian-hh wants to merge 7 commits into
mainfrom
rfc_query_planner_v2
Open

RFC: Query Planner for Per-Filter-Bitmap Disk Search#1102
YuanyuanTian-hh wants to merge 7 commits into
mainfrom
rfc_query_planner_v2

Conversation

@YuanyuanTian-hh
Copy link
Copy Markdown
Contributor

  • Does this PR have a descriptive title that could go in our release notes?
  • Does this PR add any new dependencies?
  • Does this PR modify any existing APIs?
  • Is the change to the API backwards compatible?
  • Should this result in any changes to our documentation, either updating existing docs or adding new ones?

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Any other comments?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces a new RFC proposing a “Query Planner” that chooses between flat scan and beta-filtered disk graph search based on filter bitmap match rate and index size, aiming to improve recall/latency tradeoffs for per-filter searches.

Changes:

  • Adds an RFC describing the planner’s motivation, decision thresholds, and proposed APIs (QueryPlannerConfig, QueryPlanner, QueryStrategy).
  • Documents benchmark-driven threshold choices and outlines future extensions (adaptive beta, more strategies).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -0,0 +1,346 @@
# Query Planner for Per-Filter-Bitmap Disk Search
Comment on lines +3 to +7
| | |
|---|---|
| **Authors** | tianyuanyuan |
| **Created** | 2026-05-25 |
| | |
Comment thread rfcs/01102-query-planner.md Outdated
Comment thread rfcs/01102-query-planner.md Outdated
Comment on lines +32 to +35
1. Provide a `QueryPlanner` that automatically selects the optimal search strategy (flat scan vs. beta-filtered graph search) based on index size and filter match rate.
2. Derive thresholds from benchmark experiments on datasets covering the 150K-10M index sizes.
3. Compose cleanly with the `SearchPlan` / `GraphMode` API from the [disk beta filter RFC](https://github.com/dyhyfu/DiskANN/blob/c3ae608683531765920f0844d70750efa731946a/rfcs/01101-disk-beta-filter.md) — the planner produces `SearchPlan` values, nothing else.
4. Allow callers to override thresholds via `QueryPlannerConfig` for tuning.
Comment on lines +196 to +207
pub fn plan(&self, matching_count: u64) -> QueryStrategy {
if self.total_points <= self.config.total_points_threshold {
return QueryStrategy::FlatSearch;
}

let match_rate = matching_count as f64 / self.total_points as f64;
if match_rate <= self.config.flat_search_threshold {
QueryStrategy::FlatSearch
} else {
QueryStrategy::BetaFilter
}
}
Comment thread rfcs/01102-query-planner.md Outdated
Comment on lines +209 to +218
/// Plan and produce a `SearchPlan` with the appropriate predicate wiring.
///
/// The caller provides the bitmap (as an `Arc<RoaringBitmap>`) and the
/// matching count. The planner selects the strategy and constructs the
/// `SearchPlan` with the closure already wired to the bitmap.
pub fn plan_search(
&self,
bitmap: Arc<RoaringBitmap>,
matching_count: u64,
) -> SearchPlan {
@YuanyuanTian-hh YuanyuanTian-hh added the RFC Request For Comments label May 25, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.46%. Comparing base (e3139b4) to head (a35c14e).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1102      +/-   ##
==========================================
- Coverage   89.49%   89.46%   -0.03%     
==========================================
  Files         479      482       +3     
  Lines       90337    91082     +745     
==========================================
+ Hits        80844    81491     +647     
- Misses       9493     9591      +98     
Flag Coverage Δ
miri 89.46% <ø> (-0.03%) ⬇️
unittests 89.11% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 8 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Yuanyuan Tian (from Dev Box) added 2 commits May 26, 2026 14:31
Replace total_points threshold with matching_count threshold based on
benchmark analysis across 150K-10M vectors. The hybrid approach
(matching_count <= 200K OR match_rate <= 25%) achieves 9/109 mistakes
vs 11/109 for the original total_points-based approach.

Key insight: flat scan latency depends on matching_count, not
total_points (for <=1M indexes). The beta recall dip zone is a
match-rate phenomenon (2-10%) that requires a rate-based guard.
Copy link
Copy Markdown
Contributor

@suri-kumkaran suri-kumkaran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the planner is a solid addition — the benchmark methodology is thorough, the hybrid threshold is well-justified against the data, and the routing layer cleanly composes with the beta-filter RFC's SearchPlan shape. Leaving inline comments on a few design points worth resolving before this lands.

One general note: several of the code examples carry small impl-detail issues (unnecessary Arc::clone() where ownership transfer would do, direct struct construction, hard-pinned bitmap type, etc.). These are distracting at the RFC stage — readers spend cycles on whether the code is right, rather than whether the design is right. Happy to handle these in the impl PR; the RFC just needs to nail the design decisions that need consensus.

Comment thread rfcs/01102-query-planner.md Outdated

### Background

When a query is scoped to a specific filter category, the caller extracts a list of items matching that category, maps them to DiskANN's internal vector IDs, and constructs a bitmap. The caller wraps the bitmap in a `Predicate` closure and passes it to the disk searcher.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 17 makes the caller responsible for mapping to internal IDs:

maps them to DiskANN's internal vector IDs, and constructs a bitmap

Once the beta-filter RFC flips Predicate to be keyed on external IDs (with translation inside DiskANN), the planner should follow the same contract — the caller just builds a bitmap of vertex IDs as seen from search outputs, no internal-ID mapping at the call site.

Suggested edit to line 17: drop the "internal vector IDs" phrasing. Bitmap is over external IDs; DiskANN translates internally at the predicate boundary.

Copy link
Copy Markdown
Contributor Author

@YuanyuanTian-hh YuanyuanTian-hh May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — dropped the 'internal vector IDs' phrasing.

Comment thread rfcs/01102-query-planner.md Outdated
/// `SearchPlan` with the closure already wired to the bitmap.
pub fn plan_search(
&self,
bitmap: Arc<RoaringBitmap>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan_search is pinned to RoaringBitmap, which contradicts the beta-filter RFC it builds on.

Line 219:

pub fn plan_search(&self, bitmap: Arc<RoaringBitmap>, matching_count: u64) -> SearchPlan

But line 48 in this RFC's own §Prerequisites says:

The Predicate closure can wrap any bitmap type (RoaringBitmap, BitSet, HashSet<u32>)

The whole point of Predicate = Box<dyn Fn(u32) -> bool> is bitmap-agnosticism — callers using BitSet, HashSet<u32>, or any custom backing structure can't use the planner as written.

Cleaner shape — take the predicate abstraction the beta-filter RFC already chose, with matching_count as a separate caller-provided hint (predicates aren't size-aware, so the count must come from outside regardless):

pub fn plan_search<F>(&self, predicate: F, matching_count: u64) -> SearchPlan
where F: Fn(u32) -> bool + Send + Sync + 'static + Clone,

The planner clones the predicate into whichever SearchPlan arm it picks. Bitmap implementation stays the caller's choice; the planner only cares about "is there a predicate and how many points does it match."

Copy link
Copy Markdown
Contributor Author

@YuanyuanTian-hh YuanyuanTian-hh May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — The planner is bitmap-agnostic — callers wrap any bitmap type into a closure before calling. Bounds match the beta-filter RFC's constructor signatures exactly.

Comment thread rfcs/01102-query-planner.md Outdated
Comment on lines +228 to +230
SearchPlan::FlatScan {
filter: Some(Box::new(move |id| bm.contains(id))),
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan_search builds variants via direct struct construction (lines 228-230, 234-237):

SearchPlan::FlatScan { filter: Some(Box::new(move |id| bm.contains(id))) }
SearchPlan::Graph(GraphMode::BetaFilter { predicate: Box::new(...), beta })

This bypasses the constructors (SearchPlan::flat_filtered, GraphMode::beta_filter) that the beta-filter RFC introduced precisely to centralize predicate boxing and β ∈ (0, 1] validation. As written, QueryPlannerConfig::beta is unvalidated — a caller passing beta = 2.0 produces a plan that increases distances on matching vectors (semantic inversion), silently.

Use the constructors and propagate the Result:

QueryStrategy::FlatSearch =>
    Ok(SearchPlan::flat_filtered(move |id| bitmap.contains(id))),
 QueryStrategy::BetaFilter =>
    GraphMode::beta_filter(move |id| bitmap.contains(id), beta)
        .map(SearchPlan::graph_with),

plan_search returns Result<SearchPlan, BetaError>. Validation discipline established by the beta-filter RFC stays intact through the planner.

Copy link
Copy Markdown
Contributor Author

@YuanyuanTian-hh YuanyuanTian-hh May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — now uses SearchPlan::flat_filtered() and GraphMode::beta_filter() constructors.

to caller
```

### 3. Strategy Selection
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan_search only produces FlatScan or BetaFilterGraphMode::PostFilter and Graph::Unfiltered (and any future GraphMode variant) are unreachable through the planner.

Two questions worth answering explicitly:

  1. Why is PostFilter never selected? The beta-filter RFC keeps it as a first-class mode. If it's strictly dominated by BetaFilter at every (matching_count, match_rate) cell on the benchmark surface, say so — and consider deprecating it from the underlying API. If it's not dominated, the planner is missing a case.

  2. What's the extension story? When future graph algorithms land as new GraphMode variants (the beta-filter RFC explicitly leaves room — Multihop, etc.), how does the planner pick them up? Today every new variant needs a new threshold derivation and a new QueryStrategy arm. Worth at least sketching: does each GraphMode self-describe its viable (matching_count, match_rate) region, with the planner aggregating? Or does each new variant land with a planner update?

Even a brief §Future Work note ("planner currently covers two of three modes; PostFilter dominance to be validated") would close the gap. Right now a reader can't tell whether the omission is intentional or oversight.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the omission is intentional, not an oversight.

Why PostFilter is not selected: PostFilter applies the same hard post-filter as BetaFilter but without the beta bias on PQ distances. It is strictly dominated — BetaFilter does everything PostFilter does plus biases beam traversal toward matching vectors, improving recall at high match rates. At low match rates where beta bias hurts (the 2–10% dip zone), the planner routes to FlatScan instead of PostFilter, since flat scan has ~100% recall. There is no (matching_count, match_rate) cell in the benchmark data where PostFilter outperforms both FlatScan and BetaFilter.

Extension story: Each new GraphMode variant lands with its own threshold derivation and a new QueryStrategy arm — the planner update is part of the variant's integration. Self-describing viable regions is an interesting future direction but over-engineers the current two-strategy planner. Will add a Future Work note.

to caller
```

### 3. Strategy Selection
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current decision logic has a lower bound on flat scan (matching_count ≤ 200K → flat) but no upper bound. Consider: 10M index at 20% match rate. matching_count = 2M, match_rate < 25%, so the planner picks flat scan.

From Experiment 5: on a 10M index, flat scan at 1M matching vectors is ~970ms; at 5M it's ~3.4s. 2M matching extrapolates to ~1.5–2s. Latency tracks matching_count almost linearly — at 2M it's already 5× the ~315ms beta plateau.

Real workloads will have a latency budget. If a caller's budget is, say, 500ms, the planner shouldn't route to flat scan when matching_count is large enough that flat scan blows past the budget — even if the recall would be better. Accept the recall hit; meet the SLA.

A symmetric max_brute_force_matching_count (or, more precisely, a latency-budget-aware threshold) closes this gap:

if matching_count <= matching_count_threshold       → FlatScan
else if matching_count >= max_brute_force_count     → BetaFilter   // (new)
else if match_rate <= flat_search_threshold         → FlatScan
elseBetaFilter

Today's use cases may not hit it (smaller indexes), but the decision logic should cover the full surface so larger indexes don't need a re-derivation. Even a conservative default (e.g. max_brute_force_count = 1M) caps worst-case flat-scan latency at ~1s on a 10M index.

Copy link
Copy Markdown
Contributor Author

@YuanyuanTian-hh YuanyuanTian-hh May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added max_brute_force_count (default: 1M) as a third threshold:

\
if matching_count <= 200K → FlatScan
else if matching_count >= 1M → BetaFilter // NEW
else if match_rate <= 25% → FlatScan
else → BetaFilter
\\

Validated across 7 index sizes (1M–10M, dim=384). Where flat scan crosses 500ms:

{32DE80A4-347B-4FD0-B3CE-1AE03183EC38}

Yuanyuan Tian (from Dev Box) added 3 commits May 26, 2026 19:23
- Drop 'internal vector IDs' phrasing (comment 1)
- Generic predicate instead of Arc<RoaringBitmap> in plan_search (comment 2)
- Validate beta at QueryPlanner::new(), plan_search returns Result (comment 3)
- Use SearchPlan/GraphMode constructors instead of direct struct construction
Add third threshold (max_brute_force_count = 1M) that caps flat scan
latency. Validated across 7 index sizes (1M-10M, dim=384):
- 1M index: flat crosses 500ms at ~1M matching
- 3M index: flat crosses 500ms at ~800K matching
- 5M index: flat crosses 500ms at ~600K matching
- 10M index: flat crosses 500ms at ~100K matching

Default 1M is conservative for indexes up to ~3M; callers with
larger indexes can lower it via QueryPlannerConfig.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RFC Request For Comments

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants