Add exact filesystem search and MCP server installation tools#52

Merged
Neverdecel merged 3 commits into master from claude/hermes-filesystem-search-3ky2k0 on Jun 18, 2026

@Neverdecel
Owner

Summary

This PR adds two major features to CodeRAG:

  1. Exact filesystem search (search_files) — a regex/glob complement to semantic search that agents can use instead of shelling out to grep/rg/find
  2. One-command MCP server installation (coderag install) — automatically registers CodeRAG with Claude Code, Hermes, or Codex without hand-editing config files

Key Changes

New Modules

  • coderag/fs_search.py (351 lines): Exact regex/glob search over the workspace

    • Honors CodeRAG's ignore_globs so search sees exactly the same files as the indexer
    • Uses ripgrep for fast content scanning when available and falls back to a pure-Python implementation otherwise
    • Supports multiple output modes: content (with context lines), files_only, count
    • Includes conservative secret redaction (API keys, tokens, private keys)
    • Pagination support for large result sets
  • coderag/install.py (298 lines): Interactive and CLI-driven MCP server installation

    • Supports three targets: Claude Code (.mcp.json), Hermes (~/.hermes/config.yaml), Codex (~/.codex/config.toml)
    • Idempotent: re-running never duplicates entries
    • Backs up existing config files before modification
    • Interactive wizard for customization (target selection, tool filtering, workspace scope)
    • Dry-run preview before applying changes
  • coderag/_ignore.py (35 lines): Shared ignore-glob matching

    • Extracted from indexer to ensure filesystem search and indexing agree on what files to process
    • Provides ignore_dir_names() for efficient directory pruning during walks
    • Provides is_ignored() for path matching against glob patterns
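
The shared ignore helper described above can be sketched roughly as follows. This is a minimal illustration, not the code from coderag/_ignore.py: the function names ignore_dir_names() and is_ignored() come from the PR description, but their signatures, the fnmatch-based matching, and the iter_files() walker are assumptions made for the example.

```python
import fnmatch
import os


def ignore_dir_names(ignore_globs):
    """Return bare directory names that can be pruned during a walk.

    Assumption: a pattern shaped like ``name/**`` or ``**/name/**`` means
    "skip this whole directory wherever it appears".
    """
    names = set()
    for pattern in ignore_globs:
        parts = [p for p in pattern.split("/") if p not in ("**", "*", "")]
        is_plain = len(parts) == 1 and not any(c in parts[0] for c in "*?[")
        if pattern.endswith("/**") and is_plain:
            names.add(parts[0])
    return names


def is_ignored(rel_path, ignore_globs):
    """True if a POSIX-style relative path matches any ignore glob."""
    posix = rel_path.replace(os.sep, "/")
    return any(fnmatch.fnmatchcase(posix, pattern) for pattern in ignore_globs)


def iter_files(root, ignore_globs):
    """Hypothetical walker: yields the relative paths a search would see."""
    prune = ignore_dir_names(ignore_globs)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending further.
        dirnames[:] = sorted(d for d in dirnames if d not in prune)
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")
            if not is_ignored(rel, ignore_globs):
                yield rel
```

Pruning by directory name avoids walking large ignored trees at all, while the per-file check handles patterns such as `*.pyc` that cannot be decided at the directory level. If both the indexer and the search call the same two functions, they cannot disagree about which files exist.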

Modified Modules

  • coderag/surfaces/mcp_server.py:

    • Added search_files tool to the MCP server
    • Implemented loop detection guard (_LOOP_LIMIT = 4) to prevent agents from repeatedly issuing identical searches
    • Updated instructions to explain both search tools (semantic vs. exact)
    • Added pagination support to search_code with offset parameter
  • coderag/surfaces/cli.py:

    • New cmd_install command with --wizard, --print, --yes, and --scope flags
    • Auto-detects installed agents when no target is specified
    • Previews changes before applying (dry-run first)
    • Provides next-steps guidance after installation
  • coderag/api.py:

    • Added search_files() method as a thin pass-through to fs_search.search_files
    • Wired to use the configured watched_dir and ignore_globs
  • coderag/indexer.py:

    • Refactored to use shared _ignore module instead of local ignore logic
  • README.md and AGENTS.md:

    • Updated documentation to describe the new search_files tool and install command
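
One way the loop detection guard mentioned for mcp_server.py could work is sketched below. Only the constant _LOOP_LIMIT = 4 is taken from the PR description. The LoopGuard class, its check() method, the choice to count consecutive identical calls, and the wording of the refusal message are all assumptions for illustration.

```python
import json

_LOOP_LIMIT = 4  # value quoted in the PR description


class LoopGuard:
    """Hypothetical guard that refuses the Nth identical call in a row."""

    def __init__(self, limit=_LOOP_LIMIT):
        self.limit = limit
        self._last_key = None
        self._repeats = 0

    def check(self, tool, args):
        """Return None if the call may proceed, else a refusal message."""
        # A canonical JSON encoding makes argument order irrelevant.
        key = json.dumps([tool, args], sort_keys=True, default=str)
        if key == self._last_key:
            self._repeats += 1
        else:
            self._last_key = key
            self._repeats = 1
        if self._repeats >= self.limit:
            return (
                f"{tool} was called {self._repeats} times in a row with "
                "identical arguments; change the query or paginate instead."
            )
        return None
```

Under these assumptions the first three identical calls go through and the fourth is refused, and any different call resets the counter. Returning a message rather than raising lets the tool hand the agent an explanation it can act on.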

Tests

  • tests/test_fs_search.py (114 lines): Comprehensive tests for exact filesystem search

    • Content and file glob searches
    • Output modes (content, files_only, count)
    • Context lines, pagination, redaction
    • Binary file handling, invalid regex errors
    • Consistency check between ripgrep and Python implementations
  • tests/test_install.py (123 lines): Tests for MCP server installation

    • Per-target installation (Claude, Hermes, Codex)
    • Idempotency and config merging
    • Backup creation
    • Dry-run mode
    • Wizard interaction
  • tests/test_mcp.py: Updated to test new search_files tool and loop detection
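
The idempotency, backup, and dry-run behaviours that the install tests exercise can be sketched for the Claude Code target as follows. This is an illustrative assumption rather than the code in coderag/install.py: the install_mcp_entry() function and its return values are hypothetical, and the `coderag mcp` launch command used in the usage example is a placeholder, since the PR text does not state the real one. The `mcpServers` key is the top-level key of a project-scoped .mcp.json.

```python
import json
import shutil
from pathlib import Path


def install_mcp_entry(config_path, name, entry, dry_run=False):
    """Merge one server entry into a .mcp.json-style file.

    Returns "unchanged", "would-write" (dry run), or "written".
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})

    # Idempotency: an identical entry means there is nothing to do.
    if servers.get(name) == entry:
        return "unchanged"
    servers[name] = entry  # other servers in the file are left alone

    if dry_run:
        return "would-write"
    if path.exists():
        # Keep a .bak copy next to the original before overwriting it.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(json.dumps(config, indent=2) + "\n")
    return "written"
```

A call might look like `install_mcp_entry(".mcp.json", "coderag", {"command": "coderag", "args": ["mcp"]})`, with the entry contents being placeholders. Comparing the existing entry before writing is what keeps a second run from touching the file or creating another backup.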

Notable Implementation Details

  • Ignore consistency: Both indexing and filesystem search use the same _ignore

https://claude.ai/code/session_011tgKDQJ8p7YLEzoMz32moC

…tall`

Hermes-inspired filesystem-search improvements to CodeRAG's MCP surface:

- search_files: exact regex/glob search (ripgrep-backed, pure-Python fallback)
  as the literal-match complement to semantic search_code. Supports target
  content/files, output_mode content/files_only/count, context lines,
  pagination, and conservative secret redaction. Honours the same ignore rules
  as the indexer via a shared coderag/_ignore.py helper.
- Agent ergonomics on the MCP tools: offset pagination on search_code, a loop
  guard that blocks repeated identical searches, and get_file line numbers +
  "did you mean?" filename suggestions.
- coderag install: one-command registration of the MCP server into Claude Code
  (.mcp.json), Hermes (~/.hermes/config.yaml), and Codex (~/.codex/config.toml),
  with a sensible auto-detect default and an interactive wizard. Idempotent,
  with .bak backups and a --print dry-run.
- Docs (README, AGENTS.md) and tests (test_fs_search, test_install, test_mcp).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011tgKDQJ8p7YLEzoMz32moC
@codecov-commenter

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 78.43137% with 99 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| coderag/surfaces/cli.py | 21.15% | 41 Missing ⚠️ |
| coderag/fs_search.py | 77.21% | 36 Missing ⚠️ |
| coderag/install.py | 88.70% | 20 Missing ⚠️ |
| coderag/_ignore.py | 90.00% | 1 Missing ⚠️ |
| coderag/surfaces/mcp_server.py | 97.77% | 1 Missing ⚠️ |

claude added 2 commits June 18, 2026 12:40
The redaction test wrote a fake `token = "..."` literal that gitleaks' generic-api-key
rule flagged as a leak, failing the secret-scan check on PR #52. Use a low-entropy
placeholder and a `# gitleaks:allow` marker; the redaction assertion is unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011tgKDQJ8p7YLEzoMz32moC

The redaction test feeds a dummy `token = "..."` line to verify masking; gitleaks'
generic-api-key rule flagged it and failed the secret-scan on PR #52. Because gitleaks
scans per-commit diffs, the literal lives in the PR's first commit even after the test
was tidied — so suppress it with a narrow repo .gitleaks.toml allowlist (default rules
kept) rather than rewriting history.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011tgKDQJ8p7YLEzoMz32moC
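
For context on what the redaction test in these commits checks, a conservative masking pass might look like the sketch below. It is an assumption for illustration only: the PR does not show the patterns fs_search.py actually uses, and redact_line(), the key-name list, the eight-character minimum, and the `[REDACTED]` marker are all hypothetical.

```python
import re

# Assignments such as `token = "value"` or `API_KEY=value`. The key name must
# END in a sensitive word, so `tokenizer = ...` is deliberately left alone.
_ASSIGNMENT = re.compile(
    r"""
    \b([\w-]*(?:api[_-]?key|token|secret|password))  # 1: key name
    (\s*[:=]\s*)                                     # 2: separator
    (["']?)                                          # 3: optional opening quote
    ([^\s"']{8,})                                    # 4: the value to mask
    (\3)                                             # 5: matching closing quote
    """,
    re.IGNORECASE | re.VERBOSE,
)
_PRIVATE_KEY = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")


def redact_line(line):
    """Mask likely secrets in one line of search output."""
    if _PRIVATE_KEY.search(line):
        return "[REDACTED PRIVATE KEY]"
    return _ASSIGNMENT.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{m.group(3)}[REDACTED]{m.group(5)}",
        line,
    )
```

Keeping the key name and quotes while masking only the value preserves enough of the line for an agent to see where a secret lives without exposing it. A test for such a function can use a low-entropy placeholder value, which is the approach the first commit above describes for keeping gitleaks quiet.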
@Neverdecel Neverdecel merged commit dbee8ab into master Jun 18, 2026
13 checks passed
3 participants