feat: add Ollama text vectorizer #617
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us.

Yess!! Let's go! Ollama finally
Pull request overview
Adds an Ollama-backed text embedding provider to RedisVL’s vectorizer system, enabling local embedding generation via an Ollama daemon while fitting into the existing vectorizer registry/from-dict plumbing and test suite.
Changes:
- Introduces `OllamaTextVectorizer` with sync/async embedding APIs and automatic dimension discovery.
- Wires `type: "ollama"` into the `Vectorizers` enum and `vectorizer_from_dict`, and exports the new vectorizer from `redisvl.utils.vectorize`.
- Adds an `ollama` optional extra (also included in `all`), updates `uv.lock`, and adds mocked unit tests plus opt-in integration tests gated by `REDISVL_TEST_OLLAMA=1`.
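
The opt-in gating mentioned in the last bullet can be illustrated with a small sketch. It uses the standard-library `unittest` so that it runs on its own; the PR's tests use pytest, where `pytest.mark.skipif` would likely play the same role. The helper `ollama_tests_enabled` and the test class are hypothetical names, not code from the PR.

```python
import os
import unittest


def ollama_tests_enabled(env=None):
    """Return True only when REDISVL_TEST_OLLAMA is set to exactly "1"."""
    env = os.environ if env is None else env
    return env.get("REDISVL_TEST_OLLAMA") == "1"


# Hypothetical test class: skipped unless the opt-in variable is set.
@unittest.skipUnless(ollama_tests_enabled(), "set REDISVL_TEST_OLLAMA=1 to run")
class OllamaIntegrationTest(unittest.TestCase):
    def test_gate_is_open(self):
        # A real test would talk to a local Ollama server here.
        self.assertTrue(ollama_tests_enabled())
```

Evaluating the gate at import time keeps the default test run free of any dependency on a running Ollama daemon.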
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| redisvl/utils/vectorize/text/ollama.py | New Ollama vectorizer implementation (sync/async) and model-dimension initialization. |
| redisvl/utils/vectorize/base.py | Adds ollama to the Vectorizers enum. |
| redisvl/utils/vectorize/__init__.py | Exposes OllamaTextVectorizer and adds type: "ollama" support in vectorizer_from_dict. |
| pyproject.toml | Adds ollama optional extra and includes it in all. |
| uv.lock | Locks the new ollama dependency and updates extras metadata. |
| tests/unit/test_ollama_vectorizer.py | Mocked unit tests for init, sync/async embedding, and registry/from-dict integration. |
| tests/integration/test_ollama_vectorizer_integration.py | Opt-in real Ollama integration tests gated by env var. |
Comments suppressed due to low confidence (1)
redisvl/utils/vectorize/text/ollama.py:298
- Same validation issue as `_embed_many`: the error message iterates `contents` to compute element types even when `contents` is not a list. Passing a non-iterable will raise during message formatting and obscure the actual input-type error; guard before iterating or simplify the message for non-list inputs.
if not isinstance(contents, list) or not all(
isinstance(c, str) for c in contents
):
raise TypeError(
f"Input contents must be a list of strings to embed, got {type(contents)} with elements of types {[type(c) for c in contents]}"
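
One way to apply the suggested guard, as a minimal self-contained sketch. The function names `validate_unguarded`, `validate_guarded`, and `error_message` are hypothetical and not part of the PR; only the check in the first function mirrors the excerpt above. The first version shows how a non-iterable input fails while the message is being formatted. The second checks the container type before it touches the elements.

```python
def validate_unguarded(contents):
    """Mirrors the reviewed check: the message iterates `contents` unconditionally."""
    if not isinstance(contents, list) or not all(isinstance(c, str) for c in contents):
        raise TypeError(
            "Input contents must be a list of strings to embed, "
            f"got {type(contents)} with elements of types {[type(c) for c in contents]}"
        )


def validate_guarded(contents):
    """Hypothetical fix: report the container type first, then the element types."""
    if not isinstance(contents, list):
        raise TypeError(
            f"Input contents must be a list of strings to embed, got {type(contents)}"
        )
    bad = sorted({type(c).__name__ for c in contents if not isinstance(c, str)})
    if bad:
        raise TypeError(
            f"Input contents must be a list of strings to embed, got elements of types {bad}"
        )


def error_message(validator, value):
    """Return the TypeError message raised by `validator`, or None if it passes."""
    try:
        validator(value)
    except TypeError as exc:
        return str(exc)
    return None
```

With the unguarded version, passing `42` surfaces Python's own "object is not iterable" error from inside the f-string instead of the intended message; the guarded version reports the offending type directly.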
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 6d6873f.
            embedding = self._embed("dimension check")
            return len(embedding)
        except (KeyError, IndexError) as ke:
            raise ValueError(f"Unexpected response from the Ollama API: {str(ke)}")
Unreachable exception handler in _set_model_dims
Low Severity
The except (KeyError, IndexError) handler in _set_model_dims is dead code. The _embed method already catches all exceptions except ConnectionError and TypeError and wraps them as ValueError (lines 200–203). Any KeyError or IndexError from the Ollama API response (e.g., missing "embeddings" key or empty list) gets converted to ValueError inside _embed before it can propagate to _set_model_dims. Additionally, len() on the returned list cannot raise these either. The intended "Unexpected response" message will never be surfaced.
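
The behavior described here can be reproduced with a small self-contained sketch. Everything in it is a stand-in: `FakeClient`, `embed`, `probe_dims`, and the "Embedding text failed" message are hypothetical and only imitate the structure the review attributes to `_embed` and `_set_model_dims`.

```python
class FakeClient:
    """Hypothetical stand-in for an Ollama client that returns a malformed response."""

    def embed(self, model, input):
        return {}  # no "embeddings" key


def embed(client, text):
    """Re-raises connection/type errors and wraps everything else as ValueError."""
    try:
        return client.embed(model="demo", input=text)["embeddings"][0]
    except (ConnectionError, TypeError):
        raise
    except Exception as exc:
        # The KeyError from the missing key is converted here.
        raise ValueError(f"Embedding text failed: {exc}") from exc


def probe_dims(client):
    """Imitates the dimension probe with the handler the review flags."""
    try:
        return len(embed(client, "dimension check"))
    except (KeyError, IndexError) as exc:
        # Never reached: embed() has already turned these into ValueError.
        raise ValueError(f"Unexpected response from the Ollama API: {exc}")


def probe_message(client):
    """Return the ValueError message raised by the probe, or None on success."""
    try:
        probe_dims(client)
    except ValueError as exc:
        return str(exc)
    return None
```

Calling `probe_message(FakeClient())` returns the inner wrapper's message, so the "Unexpected response" text never appears. Surfacing it would require either catching `ValueError` in the probe or letting the inner function re-raise `KeyError` and `IndexError` unchanged.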


Summary
Adds `OllamaTextVectorizer` support for local Ollama embedding models.

This PR includes Ollama optional dependency wiring, a sync/async `OllamaTextVectorizer` implementation, vectorizer registry/from-dict support for `type: "ollama"`, mocked unit tests, and opt-in real Ollama integration tests guarded by `REDISVL_TEST_OLLAMA=1`.

Usage

To run the real Ollama integration tests locally:

Testing:

    uv run pytest tests/unit/test_ollama_vectorizer.py tests/integration/test_ollama_vectorizer_integration.py -q
    REDISVL_TEST_OLLAMA=1 uv run pytest tests/integration/test_ollama_vectorizer_integration.py -q
    uv run black --check ./redisvl ./tests
    uv run isort ./redisvl ./tests --check-only --profile black
    uv lock --check
    uv run mypy redisvl
    make test

Full repo test result:

Note
Low Risk
Additive feature behind an optional extra; no changes to existing vectorizers or core Redis paths. Main operational risk is init-time calls to a local Ollama server when users instantiate the vectorizer.
Overview
Adds local Ollama as a first-class text embedding provider in RedisVL.
A new `OllamaTextVectorizer` talks to a running Ollama server (sync/async `embed`/`embed_many`, optional `host`, dimension probing on init, retries, and input validation). It is exported from `redisvl.utils.vectorize`, registered as `Vectorizers.ollama`, and constructible via `vectorizer_from_dict` with `type: "ollama"`. Optional install is wired as `redisvl[ollama]` / `all` (`ollama>=0.5.4`); the vectorizers user guide documents setup and usage.

Tests: broad unit coverage with a fake `ollama` client; opt-in integration tests when `REDISVL_TEST_OLLAMA=1` and a pulled model are available.

Reviewed by Cursor Bugbot for commit 6d6873f. Bugbot is set up for automated code reviews on this repo.