GitHub - senna-lang/Codeatrium: Minimal local memory for AI coding agents — two commands, recall everything.

English · 日本語

An AI coding agent recalls everything it has done through just two commands: loci search and loci context. That's the whole interface. The agent reaches for the right call without hesitation, and restores past decisions, conversations, and exact code locations in under 0.2 seconds.

The CLI command loci is designed to be called by the agent itself — running loci search "..." --json from within a prompt. (The name comes from the Method of Loci — the memory-palace technique. Under the hood, conversations are distilled into "palace objects"; see How It Works. The architecture extends the conversational memory model from arXiv:2603.13017 for coding agents.)

Note: Currently Claude Code only — indexing reads Claude Code's session logs (.jsonl). Distillation defaults to claude --print (Haiku) but can run on a local OpenAI-compatible LLM instead (see Configuration).

Minimal Interface

The whole recall interface is two commands:

loci search "query" — semantic search over past conversations
loci context — reverse lookup, by code symbol (--symbol "name") or git branch (--branch "name")
- tree-sitter symbol resolution (Python / TypeScript / Go) lets agents understand implementation intent before editing
- --branch "name" recalls what was done and discussed on a specific git branch (also available as loci search "query" --branch "name")

That's deliberate. The user here is the agent, and an agent handed a 50-tool palette hesitates, mis-picks, and burns tokens just deciding which to call. With a surface this small — and no MCP tool schemas sitting resident in the context window — the agent reaches for the right call the first time, every time. (When the full transcript is needed, loci show "<ref>" expands any result to its verbatim source.)

Touching a symbol means recalling what was decided about it — loci context reverse-looks-up the exact code location, signature, and the conversation behind it:

How It Works

Index — Splits agent session logs into exchanges (user utterance + agent response pairs) and indexes them with FTS5 for keyword search
Distill — An LLM (claude --print, default claude-haiku-4-5) summarizes each exchange into a palace object: exchange_core (what was done), specific_context (concrete details), room_assignments (topic tags). tree-sitter resolves touched files to symbol level (function/class/method + file + line + signature)
Search — Cross-layer search fusing BM25 on verbatim text with HNSW on distilled embeddings via RRF

Raw conversations are not embedded — only the condensed distilled text is embedded with multilingual-e5-small (384-dim), balancing semantic search quality with embedding cost. The embedding model runs as a Unix socket server, keeping search latency under 0.2 seconds after the first load.

Installation

pipx install codeatrium

Requires Python 3.11+.

Quick Start

# Initialize in project root (also registers Claude Code hooks)
loci init

loci init now handles everything in one step — database setup, existing session detection, and Claude Code hook registration. Pass --no-hooks to skip hook registration. If init fails partway through, .codeatrium/ is cleaned up automatically so re-running is safe.

When running loci init, if past session logs are detected, you'll be prompted with:

Important

When adopting this tool mid-project, a large number of exchanges may already exist. Distilling all of them will consume significant claude --print (Haiku) tokens. We recommend starting with Skip all or Distill last 50.

Min chars threshold — Minimum character filter applied at index time (default: 50). Shorter exchanges are skipped entirely, which also shrinks the pool of distillation candidates. Higher values exclude short conversations and reduce token usage; lower values include nearly everything. (Distillation applies a separate min_chars of 100 — see Configuration.)
Handling existing exchanges — Choose how much past history to distill:
- Skip all (no past session distillation)
- Distill last 50 (recent history only)
- Distill all (everything — high token cost)
- Custom (specify a number)
Run distillation now? — Accepts 1/2/y/n/yes/no. Choose No to defer to the next session start.

Invalid input on any prompt re-prompts instead of silently falling back to a default.

Agent Instructions

Agent instructions are injected automatically — no manual setup required:

loci init — Inserts a marker section (...) into CLAUDE.md
loci prime — Dynamically injects command usage into the context window at every session start via SessionStart Hook

CLI Commands

Command	Description
`loci init`	Initialize `.codeatrium/` and register Claude Code hooks (`--no-hooks` to skip)
`loci index`	Index new session logs
`loci distill [--limit N]`	Distill undistilled exchanges via LLM
`loci search "query" --json`	Semantic search (agent-facing); add `--branch NAME` to filter by git branch
`loci context --symbol "name" --json`	Code symbol → past conversations (lightweight; add `--full` for verbatim text)
`loci context --branch "name" --json`	Git branch → past conversations (includes undistilled exchanges)
`loci show "<ref>" --json`	Retrieve verbatim conversation
`loci status`	Show index state
`loci prime`	Inject command usage into the session context (run automatically by the SessionStart hook)
`loci server start/stop/status`	Embedding server management
`loci hook install`	Re-register hooks (normally already done by `loci init`)
`loci hook uninstall`	Remove codeatrium hooks from `settings.json`

Automation (Claude Code Hooks)

After loci init (or loci hook install), everything runs automatically:

Hook	Trigger	Command
Stop (async)	After every turn	`loci index`
SessionStart	startup / `/clear` / `/resume` / `compact`	`loci prime`
SessionStart	startup / `/clear` / `/resume` / `compact`	`loci server start`
SessionStart	startup / `/clear` / `/resume` / `compact`	`loci distill`

loci index — Runs asynchronously after every turn. Indexes only new exchanges, so it's fast even mid-session
loci distill — Distills undistilled exchanges at session start. Defaults to claude --print (Haiku, through the user's Claude Code); can use a local OpenAI-compatible LLM instead (see Configuration)
loci server start — Keeps the embedding model (~500MB) resident in memory for sub-0.2s search latency

Search Output

[
  {
    "exchange_core": "Added connection pool with pool_size=5",
    "specific_context": "pool_size=5, max_overflow=10",
    "rooms": [
      { "room_type": "concept", "room_key": "db-pool", "room_label": "DB connection pooling" }
    ],
    "symbols": [
      { "name": "create_pool", "file": "src/db.py", "line": 42, "signature": "def create_pool(...)" }
    ],
    "verbatim_ref": "~/.claude/projects/.../session.jsonl:ply=42",
    "git_branch": "feature/db-pool"
  }
]

Configuration

.codeatrium/config.toml (generated by loci init):

[distill]
provider = "claude"                    # Distillation backend: "claude" | "openai" (default "claude")
model = "claude-haiku-4-5"             # Model for distillation (default)
batch_limit = 20                       # Max distillations per hook run
min_chars = 100                        # Skip distillation for exchanges shorter than this

[index]
min_chars = 50                         # Skip indexing exchanges shorter than this

There are two min_chars settings: [index] min_chars controls what gets indexed at all, while [distill] min_chars further skips distillation (the LLM cost) for short exchanges that were already indexed.

Distilling with a local LLM

Distillation is a small per-exchange structured-extraction task, so a local model is usually good enough. Any OpenAI-compatible endpoint (Ollama, LM Studio, llama.cpp-server, vLLM) works by setting provider = "openai" and base_url — no new dependencies, no API key (the Authorization header is never sent, so this is local-only):

[distill]
provider = "openai"
model = "qwen2.5:7b"
base_url = "http://localhost:11434/v1"   # Ollama
# base_url = "http://localhost:1234/v1"  # LM Studio

base_url is required when provider = "openai"; if it is missing or empty the provider falls back to claude with a warning. With provider = "claude" (the default), base_url is ignored and distillation runs through claude --print as before.

Acknowledgments

The palace object model, room-based topic grouping, and BM25+HNSW fusion search are based on:

Structured Distillation for Personalized Agent Memory (arXiv:2603.13017)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 106 Commits
.github/workflows		.github/workflows
assets		assets
scripts		scripts
src/codeatrium		src/codeatrium
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
Makefile		Makefile
README.ja.md		README.ja.md
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Minimal Interface

How It Works

Installation

Quick Start

Agent Instructions

CLI Commands

Automation (Claude Code Hooks)

Search Output

Configuration

Distilling with a local LLM

Acknowledgments

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Minimal Interface

How It Works

Installation

Quick Start

Agent Instructions

CLI Commands

Automation (Claude Code Hooks)

Search Output

Configuration

Distilling with a local LLM

Acknowledgments

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages