Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 12 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,12 +68,23 @@ and the honest caveats — is in [`docs/eval.md`](docs/eval.md).
- **Bring your own model platform.** Built for self-hosted and local models first (any OpenAI-compatible server — Ollama, vLLM, LM Studio, LocalAI), with first-class **OpenAI API** and **Anthropic API** support when you want it.
- **Symbol-aware chunking.** Indexes *functions, classes, and methods* (Python via `ast`; JS/TS/Go/Rust/Java via tree-sitter), not crude fixed-size blocks — so results point at real code units with `file:line` citations.
- **Hybrid retrieval, with optional reranking.** Dense vector search **+** BM25 keyword search, fused with Reciprocal Rank Fusion — great at both "what does this *mean*" and exact-identifier lookups. Add an optional local **cross-encoder reranker** (two-stage retrieve-then-rerank, `CODERAG_RERANK=1`, no API key) to sharpen the top results.
- **Drop-in for AI coding agents.** An **MCP server** (`coderag mcp`) lets Claude Code, Codex, and Cursor search a warm, pre-indexed workspace instead of slow grep/glob/read loops — ranked `path:line` results from a single call, with the index kept live as you edit. Works on a plain file directory too, not just code.
- **Semantic *and* exact search in one tool belt.** Agents get `search_code` (hybrid, by meaning) **and** `search_files` (ripgrep-backed exact regex/glob, the literal-match complement) — plus loop-detection, pagination, and line-numbered reads with "did you mean?" hints. No more grep/glob/find loops.
- **Drop-in for AI coding agents — one command.** `coderag install` wires the **MCP server** into **Claude Code**, **Hermes**, and **Codex** (auto-detect or an interactive wizard, idempotent, with backups) so they search a warm, pre-indexed workspace instead of slow grep/glob/read loops — ranked `path:line` results from a single call, index kept live as you edit. Works on a plain file directory too, not just code.
- **Measured, not guessed.** A built-in **evaluation harness** (`coderag eval`) scores retrieval quality — recall@k, MRR, nDCG@k at file *or* symbol level — and can mine a benchmark straight from your git history. Every default (1:1 hybrid, reranker opt-in, adaptive fusion off) is the choice the harness validated, including across an external repo.
- **Incremental & live.** Content-hashed indexing only re-embeds files that changed; a debounced watcher keeps the index current as you code. No duplicate or stale vectors.
- **Built to scale.** Exact `Flat` search for small repos, automatic switch to approximate `IVF` past a threshold so it stays fast at 100k+ chunks.
- **Five surfaces, one engine.** CLI · Python library · HTTP/REST · web UI · MCP server — all thin wrappers over the same `CodeRAG` object.

### ⚡ One line: install + wire into your agent

Download, install, and register CodeRAG as your coding agent's search tool — in a single command:

```bash
pip install "coderag[mcp] @ git+https://github.com/Neverdecel/CodeRAG" && coderag install
```

This installs the engine **and** MCP server, then `coderag install` auto-detects **Claude Code**, **Hermes**, and **Codex** (or launches an interactive wizard) and wires CodeRAG in. Restart your agent and it searches a warm, pre-indexed workspace instead of grepping. Preview without writing any config: `coderag install --print`.

## 🚀 Quick start

```bash
Expand Down
Loading