13 changes: 9 additions & 4 deletions README.md
@@ -273,10 +273,13 @@ provider-backed ELF evidence was required.
personalization, local `get_all` export-style readback, and deletion audit history.
The separate OpenMemory export-helper setup probe in `live-baseline-20260611122416`
records `blocked` with `DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER`, so SDK `get_all`
-is still not UI/export evidence. The comparison records ELF as a loss on preference
-correction history, ties on scoped personalization and delete audit, `not_tested`
-for local SDK export-style parity, `blocked` for OpenMemory UI/export, and
-`non_goal` for hosted Platform export and optional graph memory in the local OSS
+is still not UI/export evidence. OpenMemory UI/export product recheck after XY-987
+refreshed that blocker in `live-baseline-20260619065543`; product browser/dashboard
+readback is still not reached because the export helper needs Docker access to a
+running OpenMemory product container. The comparison records ELF as a loss on
+preference correction history, ties on scoped personalization and delete audit,
+`not_tested` for local SDK export-style parity, `blocked` for OpenMemory UI/export,
+and `non_goal` for hosted Platform export and optional graph memory in the local OSS
lane.
- Capture/write-policy live follow-up after XY-933: ELF now passes 4/4 live
`capture_integration` jobs with zero redaction leaks, source ids preserved in
@@ -318,6 +321,7 @@ Detailed evidence and interpretation:
- [qmd Debug-Ergonomics Dreaming Retest Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-qmd-debug-ergonomics-dreaming-retest-report.md)
- [OpenViking Trajectory Materialization Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-openviking-trajectory-materialization-report.md)
- [Service-Native Dreaming Readback Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-service-native-dreaming-readback-report.md)
- [OpenMemory UI/Export Product Readback Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-openmemory-ui-export-product-readback-report.md)
- [Live Baseline Benchmark Runbook](docs/runbook/benchmarking/live_baseline_benchmark.md)
- [Single-User Production Runbook](docs/runbook/single_user_production.md)
- Benchmark contract:
@@ -403,6 +407,7 @@ Detailed comparison, mechanism-level analysis, and source map:
- [Scheduled Memory Task Scoring Report - June 16, 2026](docs/evidence/benchmarking/2026-06-16-scheduled-memory-task-scoring-report.md)
- [Dreaming Competitor-Strength Retest Report - June 17, 2026](docs/evidence/benchmarking/2026-06-17-dreaming-competitor-strength-retest-report.md)
- [qmd Debug-Ergonomics Dreaming Retest Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-qmd-debug-ergonomics-dreaming-retest-report.md)
- [OpenMemory UI/Export Product Readback Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-openmemory-ui-export-product-readback-report.md)
- [Live Baseline Benchmark Runbook](docs/runbook/benchmarking/live_baseline_benchmark.md)
- [Real-World Agent Memory Benchmark](docs/runbook/benchmarking/real_world_agent_memory_benchmark.md)
- [External Memory Improvement Plan](docs/evidence/external_memory/external_memory_improvement_plan.md)
98 changes: 98 additions & 0 deletions apps/elf-eval/fixtures/report_snapshots/2026-06-19-openmemory-ui-export-product-readback-report.json
@@ -0,0 +1,98 @@
{
  "schema": "elf.openmemory_ui_export_product_recheck_report/v1",
  "report_id": "xy-987-openmemory-ui-export-product-readback-2026-06-19",
  "authority": "XY-987",
  "created_at": "2026-06-19T06:56:58Z",
  "goal": "Recheck OpenMemory UI/export readback with a product-level local runner or publish a fresh typed setup blocker with concrete evidence.",
  "command": {
    "command": "cargo make openmemory-ui-export-readback",
    "status": "pass",
    "runtime_seconds": 78.02,
    "report_artifact": "tmp/live-baseline/live-baseline-report.json",
    "probe_artifact": "tmp/live-baseline/mem0-openmemory-ui-export.json",
    "attempt_log": "tmp/live-baseline/mem0-openmemory-export-attempt.log"
  },
  "source_baseline": {
    "previous_report": "docs/evidence/benchmarking/2026-06-11-mem0-openmemory-history-ui-export-report.md",
    "previous_snapshot": "apps/elf-eval/fixtures/report_snapshots/2026-06-11-xy-931-openmemory-ui-export-readback.json",
    "previous_status": "blocked",
    "previous_reason_code": "DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER"
  },
  "run": {
    "run_id": "live-baseline-20260619065543",
    "project_filter": "mem0",
    "sdk_baseline_status": "pass",
    "sdk_check_summary": {
      "total": 8,
      "pass": 8,
      "fail": 0,
      "wrong_result": 0,
      "blocked": 0
    },
    "ui_export_status": "blocked",
    "ui_export_reason_code": "DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER"
  },
  "same_corpus_boundary": {
    "sdk_result_artifact": "tmp/live-baseline/mem0-search.json",
    "sdk_get_all_check_status": "pass",
    "sdk_get_all_is_ui_export_evidence": false,
    "openmemory_ui_export_is_separate_product_ux_scenario": true
  },
  "openmemory_product_surface": {
    "tree_present": true,
    "ui_package_present": true,
    "compose_file_present": true,
    "export_script_present": true,
    "sunsetting_notice_present": true,
    "requires_openai_api_key": true,
    "requires_docker_compose": true,
    "export_requires_running_container": true,
    "default_export_container": "openmemory-openmemory-mcp-1"
  },
  "openmemory_probe": {
    "attempt": {
      "command": "timeout 30 bash openmemory/backup-scripts/export_openmemory.sh --user-id elf-history-user --container openmemory-openmemory-mcp-1",
      "exit_code": 1,
      "log_artifact": "tmp/live-baseline/mem0-openmemory-export-attempt.log",
      "output_excerpt": "openmemory/backup-scripts/export_openmemory.sh: line 52: docker: command not found\nERROR: Container 'openmemory-openmemory-mcp-1' not found/running. Pass --container <NAME_OR_ID> if different."
    },
    "export_validation": {}
  },
  "classification": {
    "status": "blocked",
    "comparison_judgment": "unchanged",
    "reason_code": "DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER",
    "reason": "The OpenMemory export helper requires Docker access to a running OpenMemory product container, but Docker is not available inside the baseline-runner container; browser/dashboard readback is not reached.",
    "next_action": "Add a dedicated OpenMemory Docker Compose profile that imports the generated mem0 corpus into the OpenMemory app database, starts the API/UI with explicit local or provider configuration, then rerun the export helper and validate exported memories."
  },
  "improvement_regression_readback": {
    "judgment": "unchanged",
    "improved": [
      "The OpenMemory UI/export blocker has a fresh June 19 command run, JSON artifact, and attempt log."
    ],
    "unchanged": [
      "mem0 local OSS SDK history and get_all readback remain pass-only SDK evidence.",
      "OpenMemory product UI/export readback remains blocked before same-corpus product app database validation.",
      "No ELF win, tie, or loss is allowed for OpenMemory UI/export."
    ],
    "regressed": []
  },
  "claim_boundary": {
    "elf_can_compare_against_openmemory_ui_export_after_this_run": false,
    "hosted_platform_claim": false,
    "optional_graph_memory_enabled": false,
    "sdk_get_all_is_ui_export_evidence": false,
    "product_browser_or_dashboard_readback_reached": false
  },
  "next_optimization_direction": {
    "required_fields": [
      "dedicated_openmemory_compose_profile",
      "same_corpus_import_into_openmemory_app_database",
      "openmemory_api_or_ui_readback_artifact",
      "export_zip_validation_against_elf_history_user",
      "explicit_provider_or_local_model_configuration",
      "separate_sdk_get_all_and_product_export_scorers"
    ],
    "non_goal": "Do not use hosted mem0 Platform export or private operator data in this local OSS lane."
  }
}
92 changes: 92 additions & 0 deletions apps/elf-eval/tests/real_world_job_benchmark.rs
@@ -246,6 +246,10 @@ fn service_native_dreaming_readback_materialization_json_path() -> Result<PathBu
    report_snapshot_path("2026-06-19-service-native-dreaming-readback-materialization.json")
}

fn openmemory_ui_export_product_readback_report_json_path() -> Result<PathBuf> {
    report_snapshot_path("2026-06-19-openmemory-ui-export-product-readback-report.json")
}

fn openviking_trajectory_materialization_report_markdown_path() -> Result<PathBuf> {
    Ok(workspace_root()?
        .join("docs")
@@ -270,6 +274,14 @@ fn service_native_dreaming_readback_report_markdown_path() -> Result<PathBuf> {
        .join("2026-06-19-service-native-dreaming-readback-report.md"))
}

fn openmemory_ui_export_product_readback_report_markdown_path() -> Result<PathBuf> {
    Ok(workspace_root()?
        .join("docs")
        .join("evidence")
        .join("benchmarking")
        .join("2026-06-19-openmemory-ui-export-product-readback-report.md"))
}

fn live_temporal_reconciliation_report_json_path() -> Result<PathBuf> {
    report_snapshot_path("2026-06-16-live-temporal-reconciliation-report.json")
}
@@ -3413,6 +3425,86 @@ fn assert_service_native_dreaming_docs(markdown: &str, benchmarking_index: &str,
    assert!(readme.contains("real-world-memory-service-native-dreaming"));
}

#[test]
fn openmemory_ui_export_product_recheck_preserves_blocked_boundary() -> Result<()> {
    let report = serde_json::from_str::<Value>(&fs::read_to_string(
        openmemory_ui_export_product_readback_report_json_path()?,
    )?)?;
    let markdown =
        fs::read_to_string(openmemory_ui_export_product_readback_report_markdown_path()?)?;
    let benchmarking_index = fs::read_to_string(benchmarking_index_path()?)?;
    let readme = fs::read_to_string(readme_path()?)?;

    assert_eq!(
        report.pointer("/schema").and_then(Value::as_str),
        Some("elf.openmemory_ui_export_product_recheck_report/v1")
    );
    assert_eq!(report.pointer("/authority").and_then(Value::as_str), Some("XY-987"));
    assert_eq!(
        report.pointer("/command/command").and_then(Value::as_str),
        Some("cargo make openmemory-ui-export-readback")
    );
    assert_eq!(report.pointer("/command/status").and_then(Value::as_str), Some("pass"));
    assert_eq!(
        report.pointer("/command/probe_artifact").and_then(Value::as_str),
        Some("tmp/live-baseline/mem0-openmemory-ui-export.json")
    );
    assert_eq!(report.pointer("/run/sdk_check_summary/pass").and_then(Value::as_u64), Some(8));
    assert_eq!(report.pointer("/run/ui_export_status").and_then(Value::as_str), Some("blocked"));
    assert_eq!(
        report.pointer("/run/ui_export_reason_code").and_then(Value::as_str),
        Some("DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER")
    );
    assert_eq!(
        report
            .pointer("/same_corpus_boundary/sdk_get_all_is_ui_export_evidence")
            .and_then(Value::as_bool),
        Some(false)
    );
    assert_eq!(
        report
            .pointer("/openmemory_product_surface/export_requires_running_container")
            .and_then(Value::as_bool),
        Some(true)
    );
    assert!(
        report
            .pointer("/openmemory_probe/attempt/output_excerpt")
            .and_then(Value::as_str)
            .is_some_and(|excerpt| excerpt.contains("docker: command not found")
                && excerpt.contains("Container 'openmemory-openmemory-mcp-1' not found/running"))
    );
    assert_eq!(
        report.pointer("/classification/comparison_judgment").and_then(Value::as_str),
        Some("unchanged")
    );
    assert_eq!(
        report
            .pointer("/claim_boundary/product_browser_or_dashboard_readback_reached")
            .and_then(Value::as_bool),
        Some(false)
    );
    assert!(array_contains_str(
        &report,
        "/improvement_regression_readback/unchanged",
        "OpenMemory product UI/export readback remains blocked before same-corpus product app database validation."
    )?);
    assert!(array_contains_str(
        &report,
        "/next_optimization_direction/required_fields",
        "same_corpus_import_into_openmemory_app_database"
    )?);
    assert!(markdown.contains("OpenMemory UI/export product-readback status is unchanged"));
    assert!(markdown.contains("Product browser/dashboard readback reached"));
    assert!(
        benchmarking_index.contains("2026-06-19-openmemory-ui-export-product-readback-report.md")
    );
    assert!(readme.contains("OpenMemory UI/Export Product Readback Report - June 19, 2026"));
    assert!(readme.contains("OpenMemory UI/export product recheck after XY-987"));

    Ok(())
}

fn assert_openviking_trajectory_materialization_summary(report: &Value) -> Result<()> {
    assert_eq!(
        report.pointer("/schema").and_then(Value::as_str),
123 changes: 123 additions & 0 deletions docs/evidence/benchmarking/2026-06-19-openmemory-ui-export-product-readback-report.md
@@ -0,0 +1,123 @@
---
type: Evidence
title: "OpenMemory UI/Export Product Readback Report - June 19, 2026"
description: "Checked-in benchmark evidence record: OpenMemory UI/Export Product Readback Report - June 19, 2026."
resource: docs/evidence/benchmarking/2026-06-19-openmemory-ui-export-product-readback-report.md
status: active
authority: current_state
owner: evidence
last_verified: 2026-06-19
tags:
- docs
- evidence
- benchmarking
---
# OpenMemory UI/Export Product Readback Report - June 19, 2026

Goal: Recheck OpenMemory UI/export readback after the earlier setup blocker and
publish a fresh typed product-readback boundary if a local product runner still
cannot validate same-corpus OpenMemory export.
Read this when: You need to know whether XY-987 removed the OpenMemory UI/export
blocker, whether mem0 SDK `get_all` can be used as UI/export evidence, or what setup
work remains before an ELF/OpenMemory product-UX comparison is allowed.
Inputs:
`apps/elf-eval/fixtures/report_snapshots/2026-06-19-openmemory-ui-export-product-readback-report.json`,
`tmp/live-baseline/mem0-openmemory-ui-export.json`,
`tmp/live-baseline/mem0-openmemory-export-attempt.log`,
and `docs/evidence/benchmarking/2026-06-11-mem0-openmemory-history-ui-export-report.md`.
Outputs: A fresh command run, a JSON companion, an attempt-log artifact path, and a
scenario-level improved/unchanged/blocked judgment.

## Executive Judgment

The OpenMemory UI/export product-readback status is unchanged: still blocked.

`cargo make openmemory-ui-export-readback` completed successfully as a benchmark
command and refreshed the mem0 local OSS SDK baseline:

- mem0 SDK checks: 8 pass, 0 fail.
- SDK `get_all` export-style readback: pass.
- OpenMemory UI/export product readback: blocked.
- Reason code: `DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER`.
- Fresh run id: `live-baseline-20260619065543`.

This improves freshness and auditability, not competitive status. The OpenMemory
product tree, UI package, compose file, and export helper are present, but the export
helper requires Docker access to a running OpenMemory product container from inside
the baseline runner. The attempt still fails before browser/dashboard readback or
same-corpus product app database validation is reached.
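The split between command status and probe status can be sketched in Rust. This is a minimal illustration only: `ProbeStatus`, `RunSummary`, and `ui_export_scenario_status` are hypothetical names and are not taken from the ELF codebase. The sketch assumes that a passing command or SDK baseline never upgrades the product probe.

```rust
/// Status of a single probe; the variants mirror the status strings in this report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeStatus {
    Pass,
    Blocked,
}

/// Hypothetical summary of one benchmark run (illustrative, not an ELF type).
#[derive(Debug, Clone, Copy)]
pub struct RunSummary {
    /// Whether the benchmark command itself completed successfully.
    pub command_pass: bool,
    /// Status of the mem0 SDK baseline checks.
    pub sdk_baseline: ProbeStatus,
    /// Status of the separate OpenMemory UI/export probe.
    pub ui_export: ProbeStatus,
}

/// Returns the UI/export scenario status for a run.
/// `None` means the command did not complete, so there is no fresh evidence.
/// The SDK baseline is deliberately ignored: it is separate evidence.
pub fn ui_export_scenario_status(run: &RunSummary) -> Option<ProbeStatus> {
    if !run.command_pass {
        return None;
    }
    Some(run.ui_export)
}
```

Applied to this run (command pass, SDK baseline pass, probe blocked), the sketch returns `Some(ProbeStatus::Blocked)`, which matches the judgment above.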

## Command Evidence

| Command | Result | Runtime | Artifact |
| --- | --- | ---: | --- |
| `cargo make openmemory-ui-export-readback` | command pass; OpenMemory probe `blocked` | 78.02 seconds | `tmp/live-baseline/live-baseline-report.json`, `tmp/live-baseline/mem0-openmemory-ui-export.json`, `tmp/live-baseline/mem0-openmemory-export-attempt.log` |

The probe command was:

`timeout 30 bash openmemory/backup-scripts/export_openmemory.sh --user-id elf-history-user --container openmemory-openmemory-mcp-1`

The attempt log records:

```text
openmemory/backup-scripts/export_openmemory.sh: line 52: docker: command not found
ERROR: Container 'openmemory-openmemory-mcp-1' not found/running. Pass --container <NAME_OR_ID> if different.
```
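A log like the one above could be mapped to the typed reason code roughly as follows. This is a sketch under stated assumptions: `classify_export_attempt` is a hypothetical helper, and matching on the shell's `docker: command not found` message is an illustrative heuristic, not the benchmark runner's actual logic.

```rust
/// Reason code recorded in this report for the Docker setup blocker.
pub const DOCKER_UNAVAILABLE: &str = "DOCKER_UNAVAILABLE_IN_BASELINE_RUNNER";

/// Hypothetical classifier: maps an export attempt's exit code and captured
/// output to a typed reason code, or `None` if no known blocker matches.
pub fn classify_export_attempt(exit_code: i32, output: &str) -> Option<&'static str> {
    // A zero exit code means the helper ran; export validation is a separate step.
    if exit_code == 0 {
        return None;
    }
    // The shell prints this message when the Docker CLI is missing from PATH.
    if output.contains("docker: command not found") {
        return Some(DOCKER_UNAVAILABLE);
    }
    // Unknown failure: leave it unclassified rather than guess a reason code.
    None
}
```

Returning `None` for unrecognized failures keeps an unknown error from being recorded under a reason code it does not match.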

## Product Surface Readback

| Surface | Status |
| --- | --- |
| OpenMemory tree present | `true` |
| UI package present | `true` |
| Compose file present | `true` |
| Export helper present | `true` |
| Sunsetting notice present | `true` |
| Requires OpenAI API key path | `true` |
| Requires Docker Compose | `true` |
| Export helper requires running container | `true` |
| Product browser/dashboard readback reached | `false` |
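The surface flags above can be combined with the runner's capabilities to predict whether the export helper can run at all. The following Rust sketch is illustrative: `ProductSurface`, `RunnerEnv`, and `export_can_run` are hypothetical names, and the two `RunnerEnv` fields are assumptions about what a runner would need to detect.

```rust
/// Subset of the product surface flags from the table above.
#[derive(Debug, Clone, Copy)]
pub struct ProductSurface {
    pub export_script_present: bool,
    pub export_requires_running_container: bool,
}

/// Hypothetical description of the runner environment (assumed fields).
#[derive(Debug, Clone, Copy)]
pub struct RunnerEnv {
    pub docker_cli_available: bool,
    pub product_container_running: bool,
}

/// Returns true only when every precondition for the export helper holds.
pub fn export_can_run(surface: &ProductSurface, env: &RunnerEnv) -> bool {
    if !surface.export_script_present {
        return false;
    }
    if surface.export_requires_running_container {
        // The helper needs both the Docker CLI and a live product container.
        return env.docker_cli_available && env.product_container_running;
    }
    true
}
```

With the values recorded in this report (script present, container required, no Docker CLI in the runner), the sketch returns `false`.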

## Improvement/Regression Readback

- Improved: there is now a fresh June 19 command run, JSON companion, and attempt log
for the OpenMemory product-readback blocker.
- Unchanged: OpenMemory UI/export remains blocked before same-corpus product app
database validation.
- Unchanged: mem0 local OSS SDK history and local `get_all` readback remain separate
passing evidence. They are not UI/export product evidence.
- No regression: the command still preserves the SDK/product boundary and does not
convert a setup blocker into an ELF win or loss.

## Claim Boundaries

Allowed:

- mem0 local OSS SDK checks and SDK `get_all` readback pass in the fresh run.
- OpenMemory UI/export product readback remains blocked with a concrete command,
artifact path, and setup error.
- The June 19 recheck is unchanged versus the June 11 XY-931 setup blocker except
for freshness and checked-in evidence.

Not allowed:

- Do not claim ELF can compare against OpenMemory UI/export after this run.
- Do not claim OpenMemory product UI/export pass from SDK-only `get_all` evidence.
- Do not claim hosted mem0 Platform behavior.
- Do not use this blocker as an ELF win or OpenMemory loss.
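These boundaries amount to a guard on comparison judgments. A minimal Rust sketch, assuming hypothetical names (`Judgment`, `guard_judgment`) that are not part of the ELF codebase; the variants mirror the judgment labels used in the README comparison:

```rust
/// Comparison judgments used in the README comparison text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Judgment {
    Win,
    Tie,
    Loss,
    NotTested,
    Blocked,
    NonGoal,
}

/// Hypothetical guard: rejects any competitive judgment (win, tie, loss)
/// while the product probe is still blocked.
pub fn guard_judgment(probe_blocked: bool, proposed: Judgment) -> Result<Judgment, String> {
    let competitive = matches!(proposed, Judgment::Win | Judgment::Tie | Judgment::Loss);
    if probe_blocked && competitive {
        return Err(format!("{proposed:?} is not allowed while the probe is blocked"));
    }
    Ok(proposed)
}
```

For the June 19 run, `guard_judgment(true, Judgment::Win)` returns an error, while `guard_judgment(true, Judgment::Blocked)` is accepted.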

## Next Optimization Direction

The next fair product-readback attempt needs a dedicated OpenMemory Docker Compose
profile that imports the generated mem0 corpus into the OpenMemory app database,
starts API/UI with explicit local or provider configuration, and validates exported
memories against `elf-history-user`.

Required fields before the blocker can move:

- dedicated OpenMemory compose profile,
- same-corpus import into the OpenMemory app database,
- OpenMemory API or UI readback artifact,
- export zip validation against the benchmark-owned user,
- explicit provider or local model configuration,
- separate SDK `get_all` and product export scorers.
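
The checklist above could be tracked in code along these lines. The field identifiers are the `required_fields` values from the JSON companion; `REQUIRED_FIELDS` and `unmet_required_fields` are hypothetical names used only for this sketch.

```rust
/// Required-field identifiers as listed in the JSON companion.
pub const REQUIRED_FIELDS: [&str; 6] = [
    "dedicated_openmemory_compose_profile",
    "same_corpus_import_into_openmemory_app_database",
    "openmemory_api_or_ui_readback_artifact",
    "export_zip_validation_against_elf_history_user",
    "explicit_provider_or_local_model_configuration",
    "separate_sdk_get_all_and_product_export_scorers",
];

/// Hypothetical helper: returns the required fields that are not yet
/// satisfied, in checklist order. The blocker can move only when this is empty.
pub fn unmet_required_fields(satisfied: &[&str]) -> Vec<&'static str> {
    REQUIRED_FIELDS
        .iter()
        .copied()
        .filter(|field| !satisfied.contains(field))
        .collect()
}
```

In this sketch an empty `satisfied` slice returns all six fields, which corresponds to the state recorded in this report.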
1 change: 1 addition & 0 deletions docs/evidence/benchmarking/index.md
@@ -37,6 +37,7 @@ Routes to: Benchmarking evidence concepts under `docs/evidence/benchmarking/`.
- `2026-06-16-scheduled-memory-task-scoring-report.md`: Real-World Job Benchmark Report.
- `2026-06-17-dreaming-competitor-strength-retest-report.md`: Dreaming Competitor-Strength Retest Report - June 17, 2026.
- `2026-06-19-letta-core-archive-export-readback-report.md`: Letta Core/Archive Export-Readback Report - June 19, 2026; adds a Docker-contained Letta materialization/report command while preserving all six core/archive comparison scenarios as typed blockers until exported core block JSON, archival readback/search JSON, and source ids exist.
- `2026-06-19-openmemory-ui-export-product-readback-report.md`: OpenMemory UI/Export Product Readback Report - June 19, 2026; refreshes the product UI/export recheck and preserves the scenario as blocked because the export helper still needs Docker access to a running OpenMemory product container.
- `2026-06-19-openviking-trajectory-materialization-report.md`: OpenViking Trajectory Materialization Report - June 19, 2026; materializes the context-trajectory fixture slice through a dedicated repo task while preserving staged retrieval, hierarchy selection, and recursive/context expansion as typed blockers.
- `2026-06-19-qmd-debug-ergonomics-dreaming-retest-report.md`: qmd Debug-Ergonomics Dreaming Retest Report - June 19, 2026; confirms qmd's default top-k/replay edge is unchanged while ELF keeps the narrow operator-debug trace/stage visibility wins.
- `2026-06-19-service-native-dreaming-readback-report.md`: Service-Native Dreaming Readback Report - June 19, 2026; materializes memory summary, proactive brief, and scheduled-memory derived outputs through `ElfService` readback with 9 pass, 0 wrong_result, and 2 typed XY-930 blockers.
3 changes: 3 additions & 0 deletions docs/log.md
@@ -49,3 +49,6 @@ logs.
`cargo make real-world-memory-service-native-dreaming`, proving public/local
memory summary, proactive brief, and scheduled-memory artifacts can be materialized
through `ElfService` readback while preserving XY-930 private/provider blockers.
- Added the OpenMemory UI/export product readback recheck report and snapshot for
XY-987, preserving the product UI/export scenario as blocked while keeping mem0 SDK
`get_all` evidence separate from OpenMemory product evidence.