store: Consolidate index creation using CreateIndex and add postponed index support#6434

Merged
lutter merged 12 commits into master from lutter/create-index
May 5, 2026
Collaborator

@lutter lutter commented Mar 11, 2026

Currently, GRAPH_POSTPONE_ATTRIBUTE_INDEX_CREATION is only honored when copying subgraphs. Subgraphs whose performance is limited by how fast we can write to the database, such as amp subgraphs, will also benefit from deferring index creation.

Besides postponing index creation for syncing subgraphs, this PR also refactors how indexes are created for subgraph tables, replacing scattered raw SQL generation with a unified CreateIndex abstraction.

This PR will also be the basis for speeding up graphman restore by having it defer index creation until after the data import.

Index creation consolidation

  • Introduce Table::indexes() returning all indexes (time-travel, attribute, aggregate) as structured CreateIndex objects instead of raw SQL strings
  • Parse and round-trip our own index definitions including BRIN indexes with minmax_multi_ops operator classes and various WHERE clauses
  • Remove the index_def: Option<IndexList> parameter threading and simplify callers across copy.rs, prune.rs, and deployment_store.rs
  • Add a create_index example tool for testing index definition parsing
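
To illustrate why a structured index definition is easier to work with than raw SQL strings, here is a minimal sketch. `IndexSpec` and its fields are hypothetical stand-ins invented for this example, not the actual `CreateIndex` type from this PR, and the index and table names are made up. The idea is only that one value can be rendered with or without `CONCURRENTLY` and `IF NOT EXISTS`, always on a single line.

```rust
/// Hypothetical, simplified stand-in for a structured index definition.
/// The real `CreateIndex` in graph-node is richer; this only shows the idea.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexSpec {
    pub name: String,
    pub schema: String,
    pub table: String,
    /// Index method, e.g. "btree", "brin" or "gist"
    pub method: String,
    /// Column names or expressions, already in SQL form
    pub columns: Vec<String>,
    /// Optional partial-index predicate (the part after WHERE)
    pub predicate: Option<String>,
}

impl IndexSpec {
    /// Render the definition as a single-line `CREATE INDEX` statement.
    pub fn to_sql(&self, concurrently: bool, if_not_exists: bool) -> String {
        let mut sql = String::from("create index");
        if concurrently {
            sql.push_str(" concurrently");
        }
        if if_not_exists {
            sql.push_str(" if not exists");
        }
        sql.push_str(&format!(
            " {} on \"{}\".\"{}\" using {} ({})",
            self.name,
            self.schema,
            self.table,
            self.method,
            self.columns.join(", ")
        ));
        if let Some(pred) = &self.predicate {
            sql.push_str(&format!(" where {}", pred));
        }
        sql
    }
}
```

Because callers hold a value rather than a string, the same definition can be emitted inline in the table DDL or later as an idempotent, concurrent statement without re-parsing anything.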

Postponed index creation

  • Allow deferring index creation during initial sync via CreateIndex::to_postpone(), controlled by the GRAPH_POSTPONE_INDEXES env var
  • Trigger creation of postponed indexes when a subgraph gets within a configurable number of blocks of the chain head (GRAPH_POSTPONE_INDEXES_CREATION_THRESHOLD, default 10000)
  • Re-create any missing postponed indexes on subgraph restart as a safety net, using IF NOT EXISTS + CONCURRENTLY
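
The trigger described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `PostponedIndexTrigger`, `should_create`, and `threshold_from` are hypothetical names, and the only details taken from the description are the default of 10000 blocks and the use of an atomic flag so that creation is attempted once per run.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Default distance from the chain head, as stated in the PR description.
pub const DEFAULT_THRESHOLD: u64 = 10_000;

/// Parse a threshold from an optional raw setting (for example the value of
/// an environment variable); fall back to the default if unset or invalid.
pub fn threshold_from(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_THRESHOLD)
}

/// Hypothetical guard that fires at most once per subgraph run.
pub struct PostponedIndexTrigger {
    threshold: u64,
    fired: AtomicBool,
}

impl PostponedIndexTrigger {
    pub fn new(threshold: u64) -> Self {
        Self {
            threshold,
            fired: AtomicBool::new(false),
        }
    }

    /// Returns `true` exactly once: the first time the subgraph is within
    /// `threshold` blocks of the chain head. Later calls return `false`.
    pub fn should_create(&self, subgraph_block: u64, chain_head: u64) -> bool {
        // saturating_sub avoids underflow if the subgraph is ahead of our
        // view of the chain head
        let distance = chain_head.saturating_sub(subgraph_block);
        if distance > self.threshold {
            return false;
        }
        // Atomically flip false -> true; only the caller that wins the flip
        // gets `true`
        self.fired
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}
```

Since the statements themselves use IF NOT EXISTS and CONCURRENTLY, the guard is an optimization to avoid repeated attempts rather than a correctness requirement; running the creation again after a restart is harmless.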

Test plan

  • Unit tests pass (just test-unit)
  • DDL test constants updated to match new single-line index format
  • Verify postponed index creation triggers correctly near chain head
  • Verify indexes are recreated on subgraph restart

@fordN fordN requested a review from isum March 17, 2026 16:02
Member

@isum isum left a comment

This is a nice improvement!

One note:

  • Amp subgraphs have a separate runner, so it should also be updated to trigger the postponed index creation when a subgraph reaches the chain head

@lutter lutter force-pushed the lutter/create-index branch from cf2f440 to bf31ad4 Compare May 5, 2026 22:22
lutter added 3 commits May 5, 2026 16:32
Consolidate index creation into a single `Table::indexes()` method that
returns all indexes (time-travel, attribute, aggregate) as `CreateIndex`
objects. This replaces the old string-based methods and eliminates the
`index_def: Option<IndexList>` parameter threading through the codebase.

Key changes:
- Add `Table::indexes()` combining time_travel + attribute + aggregate indexes
- Add `attr_index_spec()` and `add_attribute_indexes()` structured helpers
- Move env var check into `CreateIndex::to_postpone()` so callers need not check
- Simplify `Table::as_ddl()` to iterate indexes with postpone filtering
- Remove old `create_time_travel_indexes`, `create_attribute_indexes`,
  `create_postponed_indexes`, `create_aggregate_indexes` string methods
- Remove `index_def` parameter from Layout, DeploymentStore, SubgraphStore
- Update copy.rs to use `indexes()` + `references_column_not_in()` for new fields
- Update prune.rs to use simplified `as_ddl()` without index_def
- Update all DDL test constants for new single-line index format

Add a trigger that creates postponed indexes when a subgraph gets
within a configurable number of blocks (default 10000) of the chain
head. This ensures indexes are in place before the subgraph starts
serving queries.

The new env var GRAPH_POSTPONE_INDEXES_CREATION_THRESHOLD controls
how many blocks before the chain head to trigger index creation. The
creation is idempotent (IF NOT EXISTS + CONCURRENTLY) and only
attempted once per subgraph run via an AtomicBool guard.

Replace the IndexList-based `recreate_invalid_indexes` call in
`start_subgraph()` with a call to `create_postponed_indexes()`. This
uses `IF NOT EXISTS` and `CONCURRENTLY` to safely create any missing
postponed indexes on every restart, acting as a safety net.

Remove the now-unused `IndexList::recreate_invalid_indexes` method.
@lutter lutter force-pushed the lutter/create-index branch from bf31ad4 to 6e8f43e Compare May 5, 2026 23:33
@lutter lutter merged commit 6e8f43e into master May 5, 2026
6 checks passed
@lutter lutter deleted the lutter/create-index branch May 5, 2026 23:43