
feat: expose embedding/extraction model config in node-type registry #1174

Merged: pyramation merged 1 commit into main from feat/embedding-model-config on May 16, 2026
Conversation

@pyramation
Contributor

Summary

Adds optional embedding_model / embedding_provider parameters to 5 embedding-related node types, and extraction_model / extraction_provider to ProcessExtraction. This lets blueprint authors specify which model and provider to use per-node, rather than relying solely on runtime config.

All new parameters are optional strings with no defaults. When a parameter is omitted (null), workers fall back to the existing resolution chain (llm_module → env vars).
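The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the workers' actual code: the function name `resolve_setting`, the `EMBEDDING_MODEL` environment variable name, and the sample model strings are all hypothetical, and it assumes the per-node value takes precedence over llm_module config, which in turn takes precedence over env vars.

```python
from typing import Mapping, Optional


def resolve_setting(
    node_value: Optional[str],
    module_value: Optional[str],
    env: Mapping[str, str],
    env_key: str,
) -> Optional[str]:
    """Return the first configured value: node parameter, module config, then env."""
    # A per-node parameter (e.g. embedding_model) wins when it is set.
    if node_value is not None:
        return node_value
    # Otherwise fall back to the module-level config (assumed to come from llm_module).
    if module_value is not None:
        return module_value
    # Finally fall back to the environment; returns None if nothing is configured.
    return env.get(env_key)


# Hypothetical environment variable name and values, for illustration only.
env = {"EMBEDDING_MODEL": "env-model"}

from_node = resolve_setting("node-model", "module-model", env, "EMBEDDING_MODEL")
from_module = resolve_setting(None, "module-model", env, "EMBEDDING_MODEL")
from_env = resolve_setting(None, None, env, "EMBEDDING_MODEL")
unset = resolve_setting(None, None, {}, "EMBEDDING_MODEL")
```

Because the node-level value is null by default, existing blueprints that never set it would resolve exactly as before under this sketch.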

Node types updated:

| Node Type | New Parameters |
| --- | --- |
| SearchVector | embedding_model, embedding_provider |
| ProcessFileEmbedding | embedding_model, embedding_provider |
| ProcessImageEmbedding | embedding_model, embedding_provider |
| ProcessChunks | embedding_model, embedding_provider |
| SearchUnified (embedding sub-config) | embedding_model, embedding_provider |
| ProcessExtraction | extraction_model, extraction_provider |

Companion PR: The SQL generator and seed changes that consume these parameters are in constructive-db feat/embedding-model-config.

Review & Testing Checklist for Human

  • Verify the parameter names (embedding_model/embedding_provider vs extraction_model/extraction_provider) match what the constructive-db SQL generators extract from data->>'...'
  • Confirm SearchUnified.embedding nesting is correct — the new params should be siblings of field_name, dimensions, chunks, etc. inside the embedding properties object, not at the top level
  • Spot-check that no base_url parameter was accidentally included (intentionally excluded for billing bypass concerns)
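The nesting and base_url checks above could be automated with something like the sketch below. The schema shape is an assumption (a JSON-Schema-like dict with nested `properties`); the registry's real definition format may differ, and `check_search_unified`, `contains_key`, and `example_schema` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List

NEW_PARAMS = ("embedding_model", "embedding_provider")


def contains_key(node: Any, key: str) -> bool:
    """Return True if `key` appears as a dict key anywhere inside `node`."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(v, key) for v in node)
    return False


def check_search_unified(schema: Dict[str, Any]) -> List[str]:
    """Collect problems with where the new params live in a SearchUnified-like schema."""
    problems: List[str] = []
    top_level = schema.get("properties", {})
    nested = top_level.get("embedding", {}).get("properties", {})
    for name in NEW_PARAMS:
        # The new params should sit next to field_name, dimensions, chunks, etc.
        if name not in nested:
            problems.append(f"{name} missing from embedding properties")
        # They should not also be declared at the top level.
        if name in top_level:
            problems.append(f"{name} found at the top level")
    # base_url was intentionally excluded, so it should not appear anywhere.
    if contains_key(schema, "base_url"):
        problems.append("base_url must not be exposed")
    return problems


# Assumed shape of the SearchUnified definition, for illustration only.
example_schema = {
    "properties": {
        "embedding": {
            "type": "object",
            "properties": {
                "field_name": {"type": "string"},
                "dimensions": {"type": "integer"},
                "chunks": {"type": "object"},
                "embedding_model": {"type": "string"},
                "embedding_provider": {"type": "string"},
            },
        }
    }
}

problems = check_search_unified(example_schema)
```

An empty `problems` list means the sample definition passes all three checks; any entry names the specific parameter that is misplaced or disallowed.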

Notes

  • This is a schema-only change (parameter definitions). No runtime behavior changes in this repo.
  • base_url was intentionally excluded per discussion — exposing it would allow users to bypass the inference billing system.

Link to Devin session: https://app.devin.ai/sessions/64aab8157bae43e69f70f33c193dc903
Requested by: @pyramation

- SearchVector: add embedding_model, embedding_provider params
- ProcessFileEmbedding: add embedding_model, embedding_provider params
- ProcessImageEmbedding: add embedding_model, embedding_provider params
- ProcessChunks: add embedding_model, embedding_provider params
- SearchUnified.embedding: add embedding_model, embedding_provider params
- ProcessExtraction: add extraction_model, extraction_provider params
- All params are optional, default to null (worker falls back to runtime config)
@pyramation pyramation merged commit f5883d8 into main May 16, 2026
37 checks passed
@pyramation pyramation deleted the feat/embedding-model-config branch May 16, 2026 09:13