Neverdecel · Jun 18, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -5,7 +5,7 @@
 - `coderag/config.py`, `coderag/types.py`: Immutable `Config` and shared dataclasses.
 - `coderag/embeddings/`: `EmbeddingProvider` protocol + `fastembed` (default), `openai`, `fake`.
 - `coderag/chunking/`: Symbol-aware chunking (`python_ast.py`, `treesitter.py`, line-window `base.py`).
-- `coderag/store/`: `sqlite_store.py` (source of truth + FTS5) and `vector_index.py` (FAISS Flat/IVF cache).
+- `coderag/store/`: `lance_store.py` — a single embedded LanceDB store (chunk metadata, BM25, and vectors).
 - `coderag/retrieval/`: Hybrid dense + BM25 search fused with RRF.
 - `coderag/indexer.py`, `coderag/watch.py`: Incremental indexing and the debounced watcher.
 - `coderag/_ignore.py`: Shared ignore-glob matching used by both the indexer and `fs_search`.
@@ -29,10 +29,10 @@
 - First-party module is `coderag`; surfaces must stay thin — no engine logic in `surfaces/`.
 
 ## Architecture Invariants
-- SQLite is the source of truth; the FAISS index is a rebuildable cache (`rebuild_from_store`).
-- `chunks.id` is the FAISS id and is `AUTOINCREMENT` (ids never reused).
-- Incremental indexing is delete-before-add (no duplicate/stale vectors); unchanged files skip via content hash.
-- Embedding dimension comes from the provider, not a constant; a model change triggers a rebuild.
+- One embedded LanceDB store holds metadata + BM25 + vectors; it's rebuildable by re-indexing from source.
+- `chunks.id` is a store-managed integer id used as the fusion/hydrate key.
+- Incremental indexing is delete-before-add (no duplicate/stale rows); unchanged files skip via size+mtime then content hash.
+- Embedding dimension comes from the provider, not a constant; a model change clears the store for a clean re-index.
 
 ## Testing Guidelines
 - Place tests in `tests/` as `test_*.py`; keep them deterministic and offline (use the `fake` provider fixture).

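Review note: the two indexing invariants in the AGENTS.md hunk above (skip unchanged files via size+mtime, then content hash; delete-before-add for changed files) can be sketched as follows. This is a minimal in-memory illustration, not the project's code: `TinyStore` and `index_file` are hypothetical names, and a real indexer would stat the file before reading it, whereas here the bytes are passed in for simplicity.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TinyStore:
    """Hypothetical in-memory stand-in for the store (not LanceStore's API)."""

    # path -> (size, mtime, content_hash)
    files: Dict[str, Tuple[int, float, str]] = field(default_factory=dict)
    # path -> chunk rows (one row per line, purely for illustration)
    rows: Dict[str, List[str]] = field(default_factory=dict)

    def index_file(self, path: str, content: bytes, size: int, mtime: float) -> str:
        known = self.files.get(path)
        # Cheap check first: identical size and mtime means skip without hashing.
        if known is not None and known[:2] == (size, mtime):
            return "skipped (stat)"
        digest = hashlib.sha256(content).hexdigest()
        # Stat changed but the bytes did not (e.g. `touch`): refresh stat, keep rows.
        if known is not None and known[2] == digest:
            self.files[path] = (size, mtime, digest)
            return "skipped (hash)"
        # Delete-before-add: drop every old row for the file, then write new ones.
        self.rows.pop(path, None)
        self.rows[path] = content.decode("utf-8", errors="replace").splitlines()
        self.files[path] = (size, mtime, digest)
        return "indexed"


store = TinyStore()
first = store.index_file("a.py", b"x = 1\ny = 2\n", 12, 1.0)
again = store.index_file("a.py", b"x = 1\ny = 2\n", 12, 1.0)
```

Because old rows are dropped before new ones are written, re-indexing an edited file replaces its rows rather than appending to them.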
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
@@ -26,29 +26,28 @@ coderag/
 ├── llm.py              # Optional streamed LLM answer over retrieved chunks
 ├── embeddings/         # EmbeddingProvider protocol + fastembed / openai / fake
 ├── chunking/           # Symbol-aware chunking: python_ast, treesitter, line-window base
-├── store/              # SQLite source of truth + pluggable FAISS vector index
-│   ├── sqlite_store.py #   files/chunks/vectors + FTS5 lexical search
-│   └── vector_index.py #   FaissVectorIndex: Flat (exact) / IVF (scale)
+├── store/              # Single embedded LanceDB store
+│   └── lance_store.py  #   files/chunks + BM25 (FTS) + vectors (ANN) in one place
 ├── retrieval/          # Hybrid search: dense + BM25, fused with RRF
 └── surfaces/           # cli.py · http_api.py (FastAPI) · webui.py · mcp_server.py (MCP)
 ```
 
 ### Design invariants (don't break these)
 
-- **SQLite is the source of truth; FAISS is a rebuildable cache.** Vectors are stored as
-  BLOBs in SQLite, so `FaissVectorIndex.rebuild_from_store()` can always reconstruct the
-  index. `ensure_consistent()` does this automatically when counts disagree.
-- **`chunks.id` is the FAISS id and is `AUTOINCREMENT`** — ids are never reused, which keeps
-  a stale cache from resurrecting deleted content.
-- **Delete-before-add.** A changed file's old chunks are removed from both SQLite and FAISS
-  before new ones are added (`Indexer._write`). This is the bug the old `monitor.py` had.
+- **One LanceDB store holds everything** (chunk metadata, text/BM25, and vectors/ANN). It is
+  rebuildable from source: re-indexing recreates it, and a `--full` pass clears and rebuilds.
+- **`chunks.id` is a store-managed integer id** used as the fusion/hydrate key; ids are not
+  reused within a run.
+- **Delete-before-add.** A changed file's old rows are removed before new ones are added
+  (`Indexer._write` → `LanceStore.write_file(replace=True)`), so editing never accumulates
+  stale or duplicate rows.
 - **The embedding dimension comes from the provider**, never a hard-coded constant. A model
-  change is detected via `meta.embed_dim` and triggers a clean rebuild.
+  change is detected via the store's `meta.json` and clears the store for a clean re-index.
 - **Writes serialize; reads don't block.** All indexing/deletion goes through one lock on the
-  `CodeRAG` facade (`_index_lock`), and `FaissVectorIndex` guards its own add/remove/search/
-  rebuild — so the MCP server's background index and live watcher run safely alongside
+  `CodeRAG` facade (`_index_lock`); the store buffers writes on the writer and reads query
+  committed data — so the MCP server's background index and live watcher run safely alongside
   concurrent agent searches. Indexing may parallelize chunk+embed across `index_workers`
-  threads, but the SQLite/FAISS writes stay single-writer (`Indexer._write`).
+  threads, but the store writes stay single-writer (`Indexer._write`).
 
 ## Quality gate
 

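Review note: the DEVELOPMENT.md hunk above says a model change is detected via the store's `meta.json` and clears the store for a clean re-index. A minimal sketch of that check is below. The `embed_dim` key, the `ensure_store_matches` helper, and the `rows.bin` file are assumptions made for illustration; the actual layout of `meta.json` is not shown in this diff.

```python
import json
import shutil
import tempfile
from pathlib import Path


def ensure_store_matches(store_dir: Path, embed_dim: int) -> bool:
    """Clear the store if its recorded dimension differs; return True if cleared."""
    meta_path = store_dir / "meta.json"
    cleared = False
    if meta_path.exists():
        recorded = json.loads(meta_path.read_text(encoding="utf-8")).get("embed_dim")
        if recorded != embed_dim:
            # Dimension mismatch: old vectors are unusable, so start from empty.
            shutil.rmtree(store_dir)
            cleared = True
    store_dir.mkdir(parents=True, exist_ok=True)
    # Record the dimension reported by the provider, never a hard-coded constant.
    meta_path.write_text(json.dumps({"embed_dim": embed_dim}), encoding="utf-8")
    return cleared


with tempfile.TemporaryDirectory() as tmp:
    store_dir = Path(tmp) / ".coderag"
    created = ensure_store_matches(store_dir, 384)       # fresh store, nothing cleared
    (store_dir / "rows.bin").write_text("placeholder")   # pretend some data was indexed
    same_model = ensure_store_matches(store_dir, 384)    # same dimension, data kept
    new_model = ensure_store_matches(store_dir, 1536)    # dimension changed, store cleared
    data_survived = (store_dir / "rows.bin").exists()
```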
diff --git a/README.md b/README.md
@@ -39,7 +39,7 @@ Coding agents like Claude Code and Codex locate code by *running searches* — g
 repeat — which burns tokens and round-trips and reduces to literal keyword matching. CodeRAG
 turns the workspace into a **warm, pre-indexed** engine: a single query returns the right
 functions and files ranked by **meaning *and* keyword**, with exact `path:line` citations. The
-embedding model loads once, so each query is one in-process lookup (FAISS + BM25 + fusion), not
+embedding model loads once, so each query is one in-process lookup (vector ANN + BM25 + fusion), not
 a multi-round shell loop — and over MCP (`coderag mcp`, below) it becomes the agent's search tool.
 
 **Proof from the eval harness** — this repo's 24 natural-language → file queries (90 files /
@@ -72,7 +72,7 @@ and the honest caveats — is in [`docs/eval.md`](docs/eval.md).
 - **Drop-in for AI coding agents — one command.** `coderag install` wires the **MCP server** into **Claude Code**, **Hermes**, and **Codex** (auto-detect or an interactive wizard, idempotent, with backups) so they search a warm, pre-indexed workspace instead of slow grep/glob/read loops — ranked `path:line` results from a single call, index kept live as you edit. Works on a plain file directory too, not just code.
 - **Measured, not guessed.** A built-in **evaluation harness** (`coderag eval`) scores retrieval quality — recall@k, MRR, nDCG@k at file *or* symbol level — and can mine a benchmark straight from your git history. Every default (1:1 hybrid, reranker opt-in, adaptive fusion off) is the choice the harness validated, including across an external repo.
 - **Incremental & live.** Content-hashed indexing only re-embeds files that changed; a debounced watcher keeps the index current as you code. No duplicate or stale vectors.
-- **Built to scale.** Exact `Flat` search for small repos, automatic switch to approximate `IVF` past a threshold so it stays fast at 100k+ chunks.
+- **Built to scale.** An embedded [LanceDB](https://github.com/lancedb/lancedb) store: brute-force exact search for small repos, automatic ANN indexing past a threshold so it stays fast at 100k+ chunks.
 - **Five surfaces, one engine.** CLI · Python library · HTTP/REST · web UI · MCP server — all thin wrappers over the same `CodeRAG` object.
 
 ### ⚡ One line: install + wire into your agent
@@ -158,7 +158,7 @@ from coderag import CodeRAG, Config
 cr = CodeRAG(Config.from_env(watched_dir="/path/to/repo"))
 cr.index()
 
-for hit in cr.search("how is the FAISS index persisted?"):
+for hit in cr.search("how is the vector index persisted?"):
     print(f"{hit.location}  {hit.symbol}  (sim={hit.similarity:.2f})")
     print(hit.text)
 ```
@@ -197,7 +197,7 @@ See it live (read-only, indexing this repo): **<https://coderag-ui.neverdecel.co
 Tools like Claude Code and Codex locate code with iterative `grep`/`glob`/read loops. CodeRAG
 exposes the same workspace as a **Model Context Protocol** server, so an agent gets fast,
 ranked `path:line` results from a single call against a **warm, pre-indexed** workspace — the
-embedding model loads once and every query is then one in-process lookup (FAISS + BM25 +
+embedding model loads once and every query is then one in-process lookup (vector ANN + BM25 +
 fusion), not a multi-round shell search.
 
 ```bash
@@ -327,21 +327,20 @@ scheduled reindex — in [`deploy/README.md`](deploy/README.md).
 graph LR
     A[Source files] --> B[Symbol-aware chunking<br/>ast / tree-sitter]
     B --> C[Embeddings<br/>fastembed · OpenAI · self-hosted]
-    C --> D[(SQLite store<br/>chunks + vectors + FTS5)]
-    D --> E[FAISS index<br/>Flat → IVF]
+    C --> D[(LanceDB store<br/>chunks + vectors + BM25)]
     Q[Query] --> F[Dense + BM25]
-    E --> F
     D --> F
     F --> G[Reciprocal Rank Fusion]
     G --> H[Ranked hits<br/>path:line + score]
 ```
 
-- **SQLite is the source of truth** (chunk text, line ranges, symbols, content hashes, and the
-  raw vectors). The **FAISS index is a rebuildable cache** — it can always be reconstructed
-  from SQLite, so switching models or index types never corrupts your data.
-- Each file's content is **hashed**; unchanged files are skipped on re-index. A changed file's
-  old chunks are removed from *both* the store and the vector index **before** new ones are
-  added — so editing never accumulates stale or duplicate vectors.
+- **One embedded LanceDB store** holds everything — chunk text, line ranges, symbols, content
+  hashes, the vectors (ANN), and the BM25 index — so there is no separate cache to keep in
+  sync. The store is also a rebuildable view of your code: it can always be re-indexed from
+  source, so switching embedding models never corrupts your data.
+- Each file's content is **hashed**; unchanged files are skipped on re-index (a cheap
+  size+mtime check avoids even reading them). A changed file's old chunks are removed
+  **before** new ones are added — so editing never accumulates stale or duplicate vectors.
 
 ## ⚙️ Configuration
 
@@ -372,7 +371,7 @@ no API key needed for a local server:
 ollama serve && ollama pull llama3.1
 export OPENAI_BASE_URL=http://localhost:11434/v1   # Ollama's OpenAI-compatible endpoint
 export CODERAG_CHAT_MODEL=llama3.1
-coderag search "how is the FAISS index persisted" --answer   # answer written locally
+coderag search "how is the vector index persisted" --answer   # answer written locally
 ```
 
 ### Common settings
@@ -385,9 +384,7 @@ table is in [`docs/configuration.md`](docs/configuration.md).
 | `CODERAG_PROVIDER` | `fastembed` | Embedding backend: `fastembed` (local) · `openai` (OpenAI API **or** any OpenAI-compatible/local server) · `fake` |
 | `CODERAG_MODEL` | `BAAI/bge-small-en-v1.5` | Local embedding model (`coderag eval --list-models`) |
 | `CODERAG_WATCHED_DIR` | cwd | Codebase to index |
-| `CODERAG_STORE_DIR` | `./.coderag` | Where the DB + index live |
-| `CODERAG_INDEX_TYPE` | `auto` | `auto` · `flat` · `ivf` |
-| `CODERAG_IVF_THRESHOLD` | `50000` | Vectors before switching Flat → IVF |
+| `CODERAG_STORE_DIR` | `./.coderag` | Where the LanceDB store lives |
 | `CODERAG_TOP_K` | `8` | Results returned |
 | `OPENAI_BASE_URL` | – | Point at a self-hosted / local OpenAI-compatible server (Ollama, vLLM, LM Studio, LocalAI) — enables local embeddings **and** local answers |
 | `OPENAI_API_KEY` | – | OpenAI **cloud** embeddings / answers (optional for a local server) |
@@ -431,7 +428,7 @@ Apache License 2.0 — see [LICENSE](LICENSE).
 
 ## 🙏 Acknowledgments
 
-[FAISS](https://github.com/facebookresearch/faiss) · [fastembed](https://github.com/qdrant/fastembed) ·
+[LanceDB](https://github.com/lancedb/lancedb) · [fastembed](https://github.com/qdrant/fastembed) ·
 [tree-sitter](https://tree-sitter.github.io/tree-sitter/) · [FastAPI](https://fastapi.tiangolo.com/) ·
 [Jinja](https://jinja.palletsprojects.com/) · [Pygments](https://pygments.org/) · [watchdog](https://github.com/gorakhargosh/watchdog)
 

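Review note: the README diagram above fuses the dense and BM25 result lists with Reciprocal Rank Fusion. A minimal sketch of standard RRF is below, assuming each retriever returns chunk ids in rank order. The function name `rrf_fuse` and the constant `k = 60` (the value commonly used in the RRF literature) are assumptions; the diff does not show which constant or tie-breaking rule CodeRAG uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def rrf_fuse(rankings: Iterable[Sequence[int]], k: int = 60) -> List[int]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: Dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense_hits = [3, 1, 2]   # ids ranked by vector similarity
bm25_hits = [1, 4, 3]    # ids ranked by keyword score
fused = rrf_fuse([dense_hits, bm25_hits])
```

Ids that appear in both lists (here 1 and 3) accumulate two contributions and rise above ids found by only one retriever, which is why the fusion key has to be an id shared by both result sets.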
diff --git a/coderag/__init__.py b/coderag/__init__.py
@@ -18,7 +18,7 @@
 
 if TYPE_CHECKING:
     # Re-exported lazily at runtime via __getattr__ below (keeps ``import coderag``
-    # light — no faiss/fastembed pulled in at import). Declared here only so type
+    # light — no lancedb/fastembed pulled in at import). Declared here only so type
     # checkers and static analysis see ``CodeRAG`` as a defined export of __all__.
     from coderag.api import CodeRAG
 
@@ -28,7 +28,7 @@
 
 
 def __getattr__(name: str) -> object:
-    # Lazy re-export so ``import coderag`` stays light (no faiss/fastembed at import).
+    # Lazy re-export so ``import coderag`` stays light (no lancedb/fastembed at import).
     if name == "CodeRAG":
         from coderag.api import CodeRAG
 

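Review note: the `coderag/__init__.py` hunk above relies on a module-level `__getattr__` (PEP 562) to defer heavy imports until an attribute is first requested. The sketch below reproduces the mechanism on a throwaway module object so it runs standalone; `demo_pkg`, `_lazy_getattr`, and the `Decimal` export are hypothetical stand-ins for `coderag` and `CodeRAG`.

```python
import types
from typing import List

loaded: List[str] = []  # records when the deferred import actually happens


def _lazy_getattr(name: str) -> object:
    # Called only when normal attribute lookup on the module fails.
    if name == "Decimal":
        from decimal import Decimal  # stand-in for an expensive import

        loaded.append(name)
        return Decimal
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")


demo_pkg = types.ModuleType("demo_pkg")
# Storing the function in the module namespace is what PEP 562 looks up.
demo_pkg.__getattr__ = _lazy_getattr

before = list(loaded)            # nothing imported yet
decimal_cls = demo_pkg.Decimal   # first access triggers the import
after = list(loaded)
```

Raising `AttributeError` for unknown names keeps `hasattr` and `from ... import` behaving as they would on an ordinary module.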
diff --git a/coderag/_ignore.py b/coderag/_ignore.py
@@ -1,16 +1,23 @@
-"""Shared ignore-glob matching for indexing and exact filesystem search.
+"""Shared file-walking + ignore matching for indexing and exact filesystem search.
 
 Both the :class:`~coderag.indexer.Indexer` and the exact filesystem search
-(:mod:`coderag.fs_search`) must skip the *same* set of paths — vendored deps, VCS
-directories, build output — or the two would disagree about what "the workspace" is.
-The matching rule lives here so both callers stay in lock-step instead of each
-re-implementing it.
+(:mod:`coderag.fs_search`) must enumerate the *same* set of paths — skipping vendored
+deps, VCS directories, build output, and (optionally) anything matched by ``.gitignore`` —
+or the two would disagree about what "the workspace" is. The single :func:`walk_files`
+generator below is the one place that decision is made, so both callers stay in lock-step.
 """
 
 from __future__ import annotations
 
 import fnmatch
-from typing import Iterable, Set
+import logging
+import os
+from pathlib import Path
+from typing import Iterable, Iterator, List, Optional, Set, Tuple
+
+logger = logging.getLogger(__name__)
+
+GITIGNORE_FILE = ".gitignore"
 
 
 def ignore_dir_names(ignore_globs: Iterable[str]) -> Set[str]:
@@ -33,3 +40,117 @@ def is_ignored(rel: str, ignore_globs: Iterable[str], ignore_dirs: Set[str]) ->
     if ignore_dirs.intersection(parts):
         return True
     return any(fnmatch.fnmatch(rel, g) for g in ignore_globs)
+
+
+def _is_ancestor(base: str, dir_rel: str) -> bool:
+    """Whether a ``.gitignore`` at ``base`` still applies at ``dir_rel`` (``""`` = root)."""
+    if base == "":
+        return True
+    return dir_rel == base or dir_rel.startswith(base + "/")
+
+
+class _GitignoreMatcher:
+    """Honor nested ``.gitignore`` files during a top-down walk (nearest rule wins).
+
+    A ``.gitignore`` at directory ``B`` scopes its patterns to paths under ``B``; the
+    closest file's rules take precedence and may re-include via ``!``. We keep a stack of
+    ``(base_rel, spec)`` ordered root→leaf, trimmed to the current directory's ancestors as
+    the (DFS pre-order) walk moves, and test a path nearest-first using pathspec's
+    tri-state ``check_file`` (ignore / negated-include / no-match). A no-op if pathspec is
+    somehow unavailable, so indexing never hard-fails on a missing optional dependency.
+    """
+
+    def __init__(self) -> None:
+        try:
+            from pathspec import GitIgnoreSpec
+        except ImportError:  # pragma: no cover - pathspec is a declared dependency
+            logger.warning(
+                "pathspec not installed; .gitignore files will not be honored."
+            )
+            self._spec_cls = None
+        else:
+            self._spec_cls = GitIgnoreSpec
+        self._stack: List[Tuple[str, object]] = []
+
+    @property
+    def enabled(self) -> bool:
+        return self._spec_cls is not None
+
+    def enter(self, dir_rel: str, dir_abs: Path) -> None:
+        """Refresh the active-rule stack for ``dir_rel`` and load its ``.gitignore``."""
+        if self._spec_cls is None:
+            return
+        # Drop rules from sibling subtrees we've left; keep only ancestors of dir_rel.
+        self._stack = [
+            (base, spec) for base, spec in self._stack if _is_ancestor(base, dir_rel)
+        ]
+        try:
+            text = (dir_abs / GITIGNORE_FILE).read_text(
+                encoding="utf-8", errors="replace"
+            )
+        except OSError:
+            return  # no .gitignore here (or unreadable)
+        self._stack.append((dir_rel, self._spec_cls.from_lines(text.splitlines())))
+
+    def match(self, rel: str, *, is_dir: bool) -> bool:
+        """True if ``rel`` (root-relative POSIX) is ignored by the active rules."""
+        if not self._stack:
+            return False
+        suffix = "/" if is_dir else ""
+        for base, spec in reversed(self._stack):
+            sub = rel if base == "" else rel[len(base) + 1 :]
+            result = spec.check_file(sub + suffix)  # type: ignore[attr-defined]
+            if result.include is not None:
+                return bool(result.include)
+        return False
+
+
+def walk_files(
+    start: Path,
+    ignore_globs: Iterable[str],
+    *,
+    root: Optional[Path] = None,
+    use_gitignore: bool = True,
+) -> Iterator[Tuple[Path, str]]:
+    """Yield ``(absolute_path, posix_rel)`` for every non-ignored file under ``start``.
+
+    ``rel`` is relative to ``root`` (defaults to ``start``) so every caller shares one
+    notion of the workspace. Ignored directories are pruned *before descending* (the big
+    win at ``/home`` scale), honoring ``ignore_globs`` (dir-name prune + path globs) and,
+    when ``use_gitignore``, nested ``.gitignore`` files.
+    """
+    start = Path(start)
+    root = Path(root) if root is not None else start
+    globs = tuple(ignore_globs)
+    ignore_dirs = ignore_dir_names(globs)
+    matcher = _GitignoreMatcher() if use_gitignore else None
+    active = matcher if (matcher is not None and matcher.enabled) else None
+
+    for dirpath, dirnames, filenames in os.walk(start):
+        d_abs = Path(dirpath)
+        try:
+            d_rel = "" if d_abs == root else d_abs.relative_to(root).as_posix()
+        except ValueError:  # pragma: no cover - start outside root
+            continue
+        if active is not None:
+            active.enter(d_rel, d_abs)
+
+        kept: List[str] = []
+        for name in dirnames:
+            if name in ignore_dirs:
+                continue
+            rel = name if d_rel == "" else f"{d_rel}/{name}"
+            if is_ignored(rel, globs, ignore_dirs):
+                continue
+            if active is not None and active.match(rel, is_dir=True):
+                continue
+            kept.append(name)
+        dirnames[:] = kept
+
+        for name in filenames:
+            rel = name if d_rel == "" else f"{d_rel}/{name}"
+            if is_ignored(rel, globs, ignore_dirs):
+                continue
+            if active is not None and active.match(rel, is_dir=False):
+                continue
+            yield d_abs / name, rel
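
Review note: `walk_files` prunes ignored directories before descending by assigning to `dirnames[:]` inside a top-down `os.walk`. The standalone sketch below isolates that technique with only the standard library. `walk_pruned` is a hypothetical simplification that leaves out the `.gitignore` handling, and the temporary tree is made up for the demonstration.

```python
import fnmatch
import os
import tempfile
from pathlib import Path
from typing import Iterator, Set, Tuple


def walk_pruned(
    start: Path, skip_dirs: Set[str], skip_globs: Tuple[str, ...]
) -> Iterator[str]:
    """Yield POSIX paths relative to ``start``, never entering ``skip_dirs``."""
    for dirpath, dirnames, filenames in os.walk(start):
        # Slice assignment mutates the list os.walk holds, so pruned
        # directories are never visited (top-down walks only).
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            rel = (Path(dirpath) / name).relative_to(start).as_posix()
            if not any(fnmatch.fnmatch(rel, g) for g in skip_globs):
                yield rel


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "node_modules" / "pkg").mkdir(parents=True)
    (root / "src" / "app.py").write_text("print('hi')\n")
    (root / "src" / "app.pyc").write_text("")
    (root / "node_modules" / "pkg" / "index.js").write_text("")
    found = list(walk_pruned(root, {"node_modules"}, ("*.pyc",)))
```

Rebinding the name (`dirnames = [...]`) would not prune anything, because `os.walk` keeps its own reference to the original list; the slice assignment is the part that matters.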