Vectorize MCP Worker (Python)

Production-Grade Hybrid RAG with Multimodal Support on Cloudflare Edge in Python.

Features

Hybrid Search (Vector + BM25) with Reciprocal Rank Fusion
Multimodal Image Processing (Llama 4 Scout)
Cross-Encoder Reranking (bge-reranker-base)
Recursive Chunking with 15% overlap
One-time License System
Interactive Dashboard at /dashboard
MCP tool integration (via vectorize-mcp-tool package)
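
As an illustration of the fusion step named in the first feature, here is a minimal sketch of Reciprocal Rank Fusion. It is not taken from the worker's source: the function name, the sample document IDs, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the vector and BM25 searches.
vector_hits = ["doc-a", "doc-b", "doc-c"]
bm25_hits = ["doc-b", "doc-d", "doc-a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still keeps a non-zero score.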

Setup

Prerequisites

A Cloudflare account with Workers, Vectorize, D1, and Workers AI enabled
uv (Python package manager)
wrangler (Cloudflare CLI) -- needed to create cloud resources

Install uv

macOS:

brew install uv

Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install Wrangler (Cloudflare CLI)

Wrangler is needed to create and manage Cloudflare resources (D1 databases, Vectorize indexes, secrets). Install it globally via npm:

macOS:

brew install node          # if you don't have Node.js
npm install -g wrangler

Linux (Debian/Ubuntu):

curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt-get install -y nodejs
npm install -g wrangler

Linux (any distro, via nvm):

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
source ~/.bashrc   # or restart your shell
nvm install --lts
npm install -g wrangler

Then authenticate with your Cloudflare account:

wrangler login

This opens a browser window to authorize the CLI against your account.

Install Python Dependencies

uv init
uv tool install workers-py

Create Cloudflare Resources

The project requires three Cloudflare services: Vectorize (vector database), D1 (SQL database), and Workers AI (inference). Workers AI is enabled automatically; the other two need to be created. A fourth component, the multimodal-pro-worker, is optional and enables image features.

1. Create the Vectorize Index

This creates the vector store used for semantic search with 384-dimension BGE embeddings:

wrangler vectorize create mcp-knowledge-base --dimensions=384 --metric=cosine
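
To make the two flags concrete, the sketch below shows what a 384-dimension check and a cosine comparison look like in plain Python. The helper names are hypothetical, and Vectorize computes similarity on its side, so this is only for building intuition or for validating vectors before upload.

```python
import math

EXPECTED_DIMENSIONS = 384  # mirrors --dimensions=384 in the command above


def check_dimensions(vector, expected=EXPECTED_DIMENSIONS):
    """Raise ValueError if the vector length does not match the index."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")
    return vector


def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```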

2. Create the D1 Database

D1 stores document metadata, BM25 keyword indexes, and license records:

wrangler d1 create mcp-knowledge-db

This outputs a database ID. Copy it into wrangler.toml (created by copying wrangler.toml.example):

[[d1_databases]]
binding = "DB"
database_name = "mcp-knowledge-db"
database_id = "paste-your-database-id-here"

3. Apply the Database Schema

Run the schema migration against the remote D1 database:

wrangler d1 execute mcp-knowledge-db --remote --file=./schema.sql

This creates the following tables:

Table	Purpose
`documents`	Ingested document chunks and metadata
`keywords`	BM25 term-frequency index per document
`doc_stats`	Corpus-level statistics (total docs, avg length)
`term_stats`	Document-frequency per term
`licenses`	API license keys and quotas
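
To show how these tables relate to keyword ranking, here is a sketch of the textbook Okapi BM25 score computed from per-document term frequencies (`keywords`), corpus statistics (`doc_stats`), and document frequencies (`term_stats`). The exact formula and parameters used by the worker are not stated here, so `k1=1.2`, `b=0.75`, and the function names are assumptions.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, total_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term for one document using standard Okapi BM25."""
    # Rarer terms (small doc_freq) get a larger inverse document frequency.
    idf = math.log(1.0 + (total_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Longer-than-average documents are penalised through the b parameter.
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)


def bm25_score(query_terms, term_freqs, doc_len, avg_doc_len, total_docs, doc_freqs):
    """Sum per-term scores; terms absent from the document or corpus add nothing."""
    score = 0.0
    for term in query_terms:
        tf = term_freqs.get(term, 0)
        df = doc_freqs.get(term, 0)
        if tf == 0 or df == 0:
            continue
        score += bm25_term_score(tf, doc_len, avg_doc_len, total_docs, df)
    return score
```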

4. Set the API Key Secret

Set the API key that protects write operations (/ingest/document, /license/*):

wrangler secret put API_KEY

You will be prompted to enter the secret value interactively. The value is stored encrypted and never appears in wrangler.toml.

5. Deploy the Multimodal Worker (optional -- for image features)

The image endpoints (/ingest/image, /search/similar-images) require a separate worker that processes images via Llama 4 Scout. If you only need text search, skip this step.

cd multimodal-pro-worker
uv tool install workers-py    # if not already installed
uv run pywrangler deploy
cd ..

This deploys the multimodal-pro-worker, which the main worker calls via a Service Binding. Without it, text features work normally and the image endpoints return a 501 error with a clear message.

Deploy and Debug

Build and deploy to Cloudflare's global edge network:

uv run pywrangler deploy

Stream live logs from the deployed worker:

wrangler tail --format=json
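
If you save that output to a file, a short script can pick out failed invocations. The sketch below assumes each event is a JSON object with an `outcome` field equal to `"ok"` on success; check this field name against your own captured output before relying on it. The parser accepts both one-object-per-line and pretty-printed output.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value from text holding one or more concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # raw_decode does not skip leading whitespace itself
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def failed_events(text):
    """Return events whose (assumed) "outcome" field is anything other than "ok"."""
    return [event for event in iter_json_objects(text) if event.get("outcome") != "ok"]


# Hypothetical captured output: one compact event and one pretty-printed event.
captured = """
{"outcome": "ok", "logs": []}
{
  "outcome": "exception",
  "logs": [{"level": "error", "message": ["boom"]}]
}
"""
failures = failed_events(captured)
```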

For a complete step-by-step guide, see docs/quickstart.md. For production deployment, security, and monitoring, see docs/production.md.

API Endpoints

Endpoint	Method	Description
`/`	GET	API documentation
`/health/check`	GET	Health check
`/dashboard`	GET	Interactive playground UI
`/llms.txt`	GET	AI search engine info
`/stats/index`	GET	Index statistics
`/search/multimodal`	POST	Hybrid search (documents + images)
`/search/documents`	POST	Documents-only search
`/search/similar-images`	POST	Find similar images
`/ingest/document`	POST	Document ingestion
`/ingest/image`	POST	Image ingestion (requires multimodal-pro-worker)
`/get/document/:id`	GET	Get document by ID
`/get/image/:id`	GET	Get image by ID
`/list/documents`	GET	List documents
`/delete/document/:id`	DELETE	Delete document
`/delete/license/:key`	DELETE	Delete license
`/license/validate`	POST	Validate license
`/license/create`	POST	Create license
`/license/list`	GET	List licenses
`/license/revoke`	POST	Revoke license
`/init/reset-passphrase`	POST	Set passphrase for reset endpoints
`/reset/all`	POST	Reset all (passphrase-gated)
`/reset/documents`	POST	Reset documents (passphrase-gated)
`/reset/licenses`	POST	Reset licenses (passphrase-gated)

Search results return snippets and metadata rather than full document content. The reset endpoints (/reset/*) are gated by a passphrase set through /init/reset-passphrase, which guards against accidental deletion by an AI agent.
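
As a rough sketch of calling one of these endpoints from Python, the helper below builds a POST request for /search/documents using only the standard library. The request body fields (`query`, `topK`) and the `Authorization: Bearer` header are assumptions, not the documented schema; consult the API documentation served at `/` for the real shapes.

```python
import json
import urllib.request


def build_search_request(base_url, query, top_k=5, api_key=None):
    """Build (but do not send) a POST request for the documents-only search."""
    payload = {"query": query, "topK": top_k}  # hypothetical field names
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search/documents",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_search_request("https://your-worker.workers.dev", "your query")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```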

MCP Integration

All MCP operations are performed through the vectorize-mcp-tool package, which provides both a CLI and a FastMCP stdio server. The MCP tool dispatches directly to the worker's REST endpoints -- there are no /mcp/* proxy endpoints on the worker.

Operations: search_multimodal, search_documents, ingest, ingest_image, stats, delete, get_document, get_image, list_documents, license_validate, license_create, license_list, license_revoke, delete_license, reset_all, reset_documents, reset_licenses.
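
One way to picture this direct dispatch is a table from operation name to HTTP method and path. The pairing below is inferred by matching the operation names to the endpoint table and is an assumption, not the package's actual code; `resolve` is a hypothetical helper.

```python
from urllib.parse import quote

# Assumed mapping, inferred from the names in the API Endpoints table.
ROUTES = {
    "search_multimodal": ("POST", "/search/multimodal"),
    "search_documents": ("POST", "/search/documents"),
    "ingest": ("POST", "/ingest/document"),
    "ingest_image": ("POST", "/ingest/image"),
    "stats": ("GET", "/stats/index"),
    "delete": ("DELETE", "/delete/document/:id"),
    "get_document": ("GET", "/get/document/:id"),
    "get_image": ("GET", "/get/image/:id"),
    "list_documents": ("GET", "/list/documents"),
    "license_validate": ("POST", "/license/validate"),
    "license_create": ("POST", "/license/create"),
    "license_list": ("GET", "/license/list"),
    "license_revoke": ("POST", "/license/revoke"),
    "delete_license": ("DELETE", "/delete/license/:key"),
    "reset_all": ("POST", "/reset/all"),
    "reset_documents": ("POST", "/reset/documents"),
    "reset_licenses": ("POST", "/reset/licenses"),
}


def resolve(operation, **path_params):
    """Return (method, path) for an operation, filling in :name placeholders."""
    method, path = ROUTES[operation]
    for name, value in path_params.items():
        path = path.replace(f":{name}", quote(str(value), safe=""))
    if ":" in path:
        raise ValueError(f"missing path parameter for {operation}: {path}")
    return method, path
```

For example, `resolve("get_document", id="abc")` yields a GET against `/get/document/abc`, which a client would append to the worker URL.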

Install and Use

# Install
cd vectorize-mcp-tool && pip install -e .

# CLI usage
vectorize-mcp --url https://your-worker.workers.dev --api-key YOUR_KEY health
vectorize-mcp --url https://your-worker.workers.dev --api-key YOUR_KEY search multimodal "your query"

# MCP server for Cursor
vectorize-mcp-server  # reads VECTORIZE_URL and VECTORIZE_API_KEY from env

See vectorize-mcp-tool/README.md for full documentation.

Project Structure

vectorize-mcp-worker-python/
├── src/                        # Main worker source
│   ├── entry.py                # HTTP routing and Worker entrypoint
│   ├── bindings/               # Cloudflare binding wrappers (FFI)
│   ├── hybrid_search.py        # Vector + BM25 + RRF fusion
│   ├── ingestion.py            # Document/image ingestion pipeline
│   └── ...
├── multimodal-pro-worker/      # Separate worker for image processing
│   ├── src/entry.py            # Llama 4 Scout vision + OCR + embedding
│   └── wrangler.toml           # Independent worker config
├── vectorize-mcp-tool/         # CLI + MCP server package
│   ├── src/vectorize_mcp_tool/ # Client, server, CLI
│   └── tests/                  # MCP tool unit tests
├── tests/                      # Test suite
│   ├── unit/                   # Unit tests (no network)
│   ├── integration/            # Contract tests (worker <-> MCP tool sync)
│   └── e2e/                    # E2E tests + benchmarks (live worker)
├── schema.sql                  # D1 database DDL
├── wrangler.toml.example       # Template config (copy to wrangler.toml)
└── docs/                       # All documentation

Testing

# Unit tests (no network, fast)
uv run pytest tests/unit/ --cov=src --cov-report=term-missing

# Integration/contract tests (no network)
uv run pytest tests/integration/

# MCP tool tests
cd vectorize-mcp-tool && uv run pytest tests/

# E2E tests (requires live worker)
VECTORIZE_E2E_URL=https://... VECTORIZE_E2E_API_KEY=... uv run pytest tests/e2e/ -m "not benchmark"

# Benchmarks (persists results, detects regressions)
VECTORIZE_E2E_URL=https://... VECTORIZE_E2E_API_KEY=... uv run pytest tests/e2e/ -m benchmark

Contract tests in tests/integration/ verify that the worker and MCP tool stay in sync. Any endpoint change in the worker that isn't reflected in the MCP tool will fail these tests.

Architecture

See docs/ for detailed documentation:

docs/quickstart.md -- Step-by-step first-time setup and endpoint testing
docs/production.md -- Production deployment, security, monitoring, operations
docs/port_decisions.md -- Technology choices and rationale
docs/component_mapping.md -- TypeScript to Python mapping
docs/abstraction_layers.md -- Protocol design and FFI patterns

Credits

Original TypeScript implementation by Daniel Nwaneri.
