Add Agent Knowledge OS closeout benchmark #223

Merged: yvette-carlisle merged 1 commit into main from y/elf-xy-1023 on Jun 19, 2026

Conversation

@yvette-carlisle

Member

Summary

  • Add the XY-1023 Agent Knowledge OS closeout benchmark report and JSON snapshot covering 19 products across six scenarios.
  • Preserve competitor-specific strengths, including VectifyAI PageIndex/OpenKB, as optimization inputs instead of broad pass claims.
  • Wire the report into README and docs/evidence/benchmarking index, with regression tests for key boundaries and docs wiring.

Validation

  • cargo make real-world-memory
  • python3 -m json.tool apps/elf-eval/fixtures/report_snapshots/2026-06-20-agent-knowledge-os-closeout-benchmark-report.json
  • cargo test -p elf-eval --test real_world_job_benchmark agent_knowledge_os_closeout_benchmark -- --nocapture
  • cargo test -p elf-eval --test real_world_job_benchmark -- --nocapture
  • cargo make test
  • cargo make fmt
  • cargo make check-docs
  • decodex docs check
  • git diff --check
  • cargo make check
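
The `python3 -m json.tool` step above checks that the snapshot file parses as JSON. A minimal sketch of the same kind of check as a reusable helper follows; the function name `check_json_snapshot` is hypothetical and is not part of this PR or of `elf-eval`.

```python
import json
from pathlib import Path


def check_json_snapshot(path: str) -> bool:
    """Return True if the file at `path` parses as JSON, else False.

    Hypothetical helper for illustration only; it checks syntax,
    not the report's schema or contents.
    """
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # OSError: file missing or unreadable.
        # ValueError: covers json.JSONDecodeError and bad UTF-8.
        return False
    return True
```

Unlike `json.tool`, this sketch does not pretty-print the document; it only reports whether parsing succeeded.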

@yvette-carlisle merged commit 42b9a1c into main on Jun 19, 2026
12 checks passed
@yvette-carlisle deleted the y/elf-xy-1023 branch on June 19, 2026, 22:27

1 participant