fix(weaviate): tag near_*/hybrid/bm25 spans as RETRIEVER + populate input/output #175

Open
SuhaniNagpal7 wants to merge 1 commit into dev from fix/weaviate-retriever-attrs

@SuhaniNagpal7
Contributor

Summary

WeaviateInstrumentor emits spans for its four retrieval methods (NearVector, NearText, Hybrid, Bm25), but none of them set the FI canonical retriever keys. As a result, the Future AGI dashboard renders all four as Type: unknown with empty Input/Output panels.

This PR sets three FI canonical attributes (gen_ai.span.kind, input.value, output.value) on all four retrieval wrappers. FetchObjects, Insert, InsertMany, DeleteById, and DeleteMany are untouched because none of them is a similarity retrieval.

What changes

All in traceai_weaviate/_wrappers.py:

  • Optional fi_instrumentation.fi_types import with raw-string fallback.
  • New _weaviate_objects_summary(objects) helper — builds a JSON [{uuid, properties, score?}, ...] summary from the first 50 result objects.
  • NearVectorWrapper.__call__: kind=RETRIEVER, input.value from a JSON summary of {limit, certainty, distance, vector_dim}, output.value from the objects summary.
  • NearTextWrapper.__call__: kind=RETRIEVER, input.value set directly to the user's query string (text/plain mime, since the dashboard renders text inputs more cleanly than JSON), output.value from the same summary helper.
  • HybridWrapper.__call__: same shape as NearText.
  • Bm25Wrapper.__call__: same shape as NearText.

All pre-existing db.vector.* attrs are preserved; the change is purely additive.
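
As a rough illustration of the additive behaviour, the sketch below builds the near_vector input summary and merges the three retriever attributes on top of existing ones. The function names, the db.vector.example key, and the choice to omit unset fields are assumptions made for this example, not details taken from the PR.

```python
import json


def near_vector_input_value(vector, limit=None, certainty=None, distance=None):
    """JSON input summary for a near_vector call; unset fields are omitted."""
    fields = {
        "limit": limit,
        "certainty": certainty,
        "distance": distance,
        "vector_dim": len(vector) if vector is not None else None,
    }
    return json.dumps({k: v for k, v in fields.items() if v is not None})


def retriever_attributes(input_value, output_value):
    """The three FI canonical attributes named in this PR."""
    return {
        "gen_ai.span.kind": "RETRIEVER",
        "input.value": input_value,
        "output.value": output_value,
    }


# "db.vector.example" is a hypothetical stand-in for a pre-existing attribute.
existing = {"db.vector.example": "kept"}
new = retriever_attributes(near_vector_input_value([0.1, 0.2, 0.3], limit=2), "[]")
attrs = {**existing, **new}  # additive: nothing already set is dropped
print(attrs["input.value"])
```

Only the vector's length is recorded, not the vector itself, which keeps the input panel readable for high-dimensional embeddings.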

Verified

  • In-process attribute check via direct wrapper invocation with SimpleNamespace(objects=[...]) results.
  • Real end-to-end ingest to Future AGI → confirmed via Future AGI MCP that all four spans show:
    • weaviate near_text: Type=Retriever, Input="where is the Eiffel Tower", Output=[{uuid, properties:{text:"Eiffel Tower is in Paris."}, score:0.9}, ...]
    • weaviate near_vector: Type=Retriever, Input={"limit": 2, "vector_dim": 3}, Output=objects summary
    • weaviate hybrid: Type=Retriever, Input="eiffel tower paris", Output=objects summary
    • weaviate bm25: Type=Retriever, Input="eiffel tower paris", Output=objects summary
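
An in-process check of the kind described above might be sketched as follows. RecordingSpan and annotate_near_text are hypothetical stand-ins written for this example; the real check invokes the wrappers in traceai_weaviate/_wrappers.py directly, and their signatures are not shown in this PR description.

```python
import json
from types import SimpleNamespace


class RecordingSpan:
    """Minimal stand-in for a span that records attributes in a dict."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


def annotate_near_text(span, query, result):
    """Hypothetical stand-in for the attribute-setting part of NearTextWrapper."""
    summary = [
        {"uuid": str(obj.uuid), "properties": obj.properties}
        for obj in result.objects[:50]
    ]
    span.set_attribute("gen_ai.span.kind", "RETRIEVER")
    span.set_attribute("input.value", query)  # plain text, not JSON
    span.set_attribute("output.value", json.dumps(summary))


# Fake query result built with SimpleNamespace, as in the in-process check.
result = SimpleNamespace(
    objects=[SimpleNamespace(uuid="u1", properties={"text": "Eiffel Tower is in Paris."})]
)
span = RecordingSpan()
annotate_near_text(span, "where is the Eiffel Tower", result)

assert span.attributes["gen_ai.span.kind"] == "RETRIEVER"
assert span.attributes["input.value"] == "where is the Eiffel Tower"
```

A check like this exercises the attribute logic without a running Weaviate instance, so it complements rather than replaces the end-to-end ingest.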

Out of scope

Per-document retrieval.documents.N.* attrs (Tier 3) are not covered here. The other six vector DBs get their own PRs.

… input/output

The Weaviate instrumentor emits spans for the four retrieval methods
(NearVector, NearText, Hybrid, Bm25) but never sets the FI canonical
retriever keys. As a result, the Future AGI dashboard shows Type=unknown
with empty Input/Output panels.

Changes (all in traceai_weaviate/_wrappers.py)

- Optional `fi_instrumentation.fi_types` import with raw-string fallback.
- New `_weaviate_objects_summary(objects)` helper — builds a JSON
  [{uuid, properties, score?}, ...] summary from the first 50 weaviate
  result objects.
- `NearVectorWrapper.__call__`:
  - Set `gen_ai.span.kind = "RETRIEVER"`.
  - Set `input.value` from a JSON summary {limit, certainty, distance,
    vector_dim}.
  - Set `output.value` from the objects summary.
- `NearTextWrapper.__call__`:
  - Same kind tag.
  - Set `input.value` directly to the user's `query` string (text/plain
    mime, since the dashboard renders text inputs more cleanly than JSON).
  - Output via the same summary helper.
- `HybridWrapper.__call__`:
  - Same shape as NearText: query text → input.value, objects → output.
- `Bm25Wrapper.__call__`:
  - Same shape as NearText.

FetchObjects is untouched (it's a list/paginate operation, not a
similarity retrieval). Insert/InsertMany/DeleteById/DeleteMany are
untouched — not retrievals. All pre-existing `db.vector.*` attrs
preserved (additive change).

Verified end-to-end via Future AGI MCP. All four spans now show
Type=Retriever in the dashboard with populated Input/Output panels.
@SuhaniNagpal7 SuhaniNagpal7 self-assigned this May 13, 2026