fix(weaviate): tag near_*/hybrid/bm25 spans as RETRIEVER + populate input/output #175
Open
SuhaniNagpal7 wants to merge 1 commit into
… input/output
The Weaviate instrumentor emits spans for the four retrieval methods
(NearVector, NearText, Hybrid, Bm25) but never sets the FI canonical
retriever keys. As a result, the Future AGI dashboard shows Type=unknown
with empty Input/Output panels.
Changes (all in traceai_weaviate/_wrappers.py)
- Optional `fi_instrumentation.fi_types` import with raw-string fallback.
- New `_weaviate_objects_summary(objects)` helper: builds a JSON
  [{uuid, properties, score?}, ...] summary from the first 50 Weaviate
  result objects.
- `NearVectorWrapper.__call__`:
  - Set `gen_ai.span.kind = "RETRIEVER"`.
  - Set `input.value` from a JSON summary {limit, certainty, distance,
    vector_dim}.
  - Set `output.value` from the objects summary.
- `NearTextWrapper.__call__`:
  - Same kind tag.
  - Set `input.value` directly to the user's `query` string (text/plain
    mime, since the dashboard renders text inputs more cleanly than JSON).
  - Output via the same summary helper.
- `HybridWrapper.__call__`:
  - Same shape as NearText: query text → input.value, objects → output.
- `Bm25Wrapper.__call__`:
  - Same shape as NearText.
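For reviewers who want a concrete picture of the summary helper described in the list above, here is a minimal sketch. It is not the code from this PR: the function name `summarize_objects`, the `MAX_OBJECTS` constant, and the assumption that the score lives at `obj.metadata.score` are all illustrative. Only the output shape ([{uuid, properties, score?}, ...]) and the 50-object cap come from the description.

```python
import json
from types import SimpleNamespace
from typing import Any, Iterable, Optional

MAX_OBJECTS = 50  # cap stated in the change description


def summarize_objects(objects: Optional[Iterable[Any]], limit: int = MAX_OBJECTS) -> str:
    """Return a JSON list of {uuid, properties, score?} for the first `limit` objects."""
    summary = []
    for obj in list(objects or [])[:limit]:
        uuid = getattr(obj, "uuid", None)
        entry = {
            "uuid": str(uuid) if uuid is not None else None,
            "properties": getattr(obj, "properties", None),
        }
        # Assumption: the score, when present, is exposed as obj.metadata.score.
        score = getattr(getattr(obj, "metadata", None), "score", None)
        if score is not None:
            entry["score"] = score
        summary.append(entry)
    # default=str keeps non-JSON-native property values (dates, UUIDs) serializable.
    return json.dumps(summary, default=str)


# Mocked result objects, in the spirit of the SimpleNamespace-based checks.
mock_objects = [
    SimpleNamespace(
        uuid="a1",
        properties={"text": "Eiffel Tower is in Paris."},
        metadata=SimpleNamespace(score=0.9),
    ),
    SimpleNamespace(uuid="b2", properties={"text": "Louvre"}, metadata=None),
]
summary_json = summarize_objects(mock_objects)
```

The `score` key is omitted rather than set to null when no score is available, which matches the optional `score?` in the described shape.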
FetchObjects is untouched (it's a list/paginate operation, not a
similarity retrieval). Insert/InsertMany/DeleteById/DeleteMany are
untouched — not retrievals. All pre-existing `db.vector.*` attrs
preserved (additive change).
Verified end-to-end via Future AGI MCP. All four spans now show
Type=Retriever in the dashboard with populated Input/Output panels.
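As a rough illustration of how the three attributes could be assembled per wrapper, the sketch below builds the attribute dictionary without touching a real span. The helper names `near_vector_input` and `retriever_attributes` are hypothetical, and `FiSpanKindValues` is an assumed name for the enum in `fi_instrumentation.fi_types`; if that import fails for any reason the raw string is used, mirroring the fallback described above. Dropping unset parameters from the NearVector payload is also an assumption.

```python
import json
from typing import Dict, Optional, Sequence

try:
    # Assumed enum name; the real module may expose something different.
    from fi_instrumentation.fi_types import FiSpanKindValues
    RETRIEVER_KIND = str(FiSpanKindValues.RETRIEVER.value)
except Exception:
    # Raw-string fallback when the optional dependency is unavailable.
    RETRIEVER_KIND = "RETRIEVER"


def near_vector_input(
    vector: Optional[Sequence[float]],
    limit: Optional[int] = None,
    certainty: Optional[float] = None,
    distance: Optional[float] = None,
) -> str:
    """JSON summary of a NearVector call: {limit, certainty, distance, vector_dim}."""
    payload = {
        "limit": limit,
        "certainty": certainty,
        "distance": distance,
        "vector_dim": len(vector) if vector is not None else None,
    }
    # Assumption: parameters that were not supplied are left out of the summary.
    return json.dumps({k: v for k, v in payload.items() if v is not None})


def retriever_attributes(input_value: str, output_value: str) -> Dict[str, str]:
    """The three attributes set on each retrieval span."""
    return {
        "gen_ai.span.kind": RETRIEVER_KIND,
        "input.value": input_value,
        "output.value": output_value,
    }


# NearVector: JSON input summary. NearText/Hybrid/Bm25: the raw query string.
vector_attrs = retriever_attributes(near_vector_input([0.1, 0.2, 0.3], limit=2), "[]")
text_attrs = retriever_attributes("where is the Eiffel Tower", "[]")
```

In a real wrapper these key/value pairs would be applied to the active span (for example with OpenTelemetry's `span.set_attribute`), alongside the existing `db.vector.*` attributes.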
Summary

`WeaviateInstrumentor` emits spans for its four retrieval methods (`NearVector`, `NearText`, `Hybrid`, `Bm25`), but none of them set the FI canonical retriever keys. Future AGI dashboard renders all four as Type: unknown with empty Input/Output panels.

This PR adds three FI canonical attributes on all four retrieval wrappers.

`FetchObjects` / `Insert` / `InsertMany` / `DeleteById` / `DeleteMany` are untouched; none are similarity retrievals.

What changes

All in `traceai_weaviate/_wrappers.py`:

- `fi_instrumentation.fi_types` import with raw-string fallback.
- `_weaviate_objects_summary(objects)` helper: builds a JSON `[{uuid, properties, score?}, ...]` summary from the first 50 result objects.
- `NearVectorWrapper.__call__`: `kind=RETRIEVER`, `input.value` from `{limit, certainty, distance, vector_dim}` JSON, `output.value` from objects summary.
- `NearTextWrapper.__call__`: `kind=RETRIEVER`, `input.value` directly from the user's `query` string (`text/plain` mime; dashboard renders text inputs more cleanly than JSON), `output.value` from the same summary helper.
- `HybridWrapper.__call__`: same shape as NearText.
- `Bm25Wrapper.__call__`: same shape as NearText.

All pre-existing `db.vector.*` attrs preserved.

Verified

`SimpleNamespace(objects=[...])` results.

- `weaviate near_text`: Type=Retriever, Input="where is the Eiffel Tower", Output=[{uuid, properties:{text:"Eiffel Tower is in Paris."}, score:0.9}, ...]
- `weaviate near_vector`: Type=Retriever, Input={"limit": 2, "vector_dim": 3}, Output=objects summary
- `weaviate hybrid`: Type=Retriever, Input="eiffel tower paris", Output=objects summary
- `weaviate bm25`: Type=Retriever, Input="eiffel tower paris", Output=objects summary

Out of scope

Per-document `retrieval.documents.N.*` attrs (Tier 3). Other 6 vector DBs get their own PRs.