feat(mcp): add HugeGraph MCP V1 stable tool surface #368
This workflow will be triggered when a pull request is opened. It will then post a comment "@codecov-ai-reviewer review" to help with automated AI code reviews. It will use the `peter-evans/create-or-update-comment` action to create the comment. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yan Chao Mei <1653720237@qq.com>
Co-authored-by: imbajin <jin@apache.org>
Updated configuration instructions and file paths for MCP.
Reformat 4 test files to pass Ruff Code Quality CI:
- tests/test_error_handling.py
- tests/test_execute_gremlin_read.py
- tests/test_execute_gremlin_write.py
- tests/test_execute_schema_operations.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add design_schema() function in schema_tools.py with best practices documentation
- Add design_schema_tool() MCP tool wrapper in server.py
- Update README.md with new feature description
- Include usage guidelines: when to use, when not to use, and workflow
Catch graph-mcp up with the 11 latest commits on main (LLM fixes, REST graph-extract API, schema generator persistence, PDF RAG uploads, client 1.7.0 assertions, code-scan refactors). graph-mcp retains its own MCP work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Change-Id: I3ec5e62b4caa3df52bb808c7ed1beead785aff2b
Main changes:
- Add the hugegraph-mcp workspace member and package entrypoint.
- Expose the V1 stable MCP tools for graph inspection, Gremlin generation/read execution, graph data extraction/import/delete, schema design/validation/dry-run, and admin-gated debug operations.
- Add runtime readonly guards, conservative Gremlin read safety checks, unified response envelopes, and dry-run / plan_hash / confirm safety flow for write paths.
- Add HugeGraph MCP tests and CI workflow coverage.
- Add HugeGraph-related Codex skills under skills/.
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
imbajin
left a comment
Reviewed the current MCP/client changes at head 8842bb53007102b47d50e4eaabb6e2cc51e9b526. I found several install/runtime safety issues that should be addressed before relying on the new MCP package independently. Local non-live checks passed, so these comments focus on behavioral and packaging gaps rather than test failures.
| dependencies = [ | ||
| "fastmcp>=2.2.0", | ||
| "hugegraph-python-client", |
hugegraph-mcp depends on unconstrained hugegraph-python-client here. When this package is installed outside the current uv workspace, resolution falls back to the PyPI package hugegraph-python-client==0.1.1, not the 1.7.0 client in this PR. The MCP code depends on the 1.7.0 graphspace/auth-routing behavior, so an isolated install can pick an incompatible client.
Please constrain this dependency to the required version, or use the correct published package name, and add an isolated install/import test that does not rely on workspace sources. That test should verify the resolved client version and graphspace routing behavior.
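A minimal sketch of the version assertion such an isolated install test could make, assuming the 1.7.0 floor named in this comment. `parse_version`, `meets_minimum`, and `assert_client_version` are hypothetical helpers, not code from this PR, and the check would only be meaningful when run in a fresh environment that does not use the workspace sources.

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '1.7.0' into (1, 7, 0), stopping at the first non-numeric segment."""
    parts = []
    for segment in text.split("."):
        match = re.match(r"\d+", segment)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "1.7.0") -> bool:
    """Compare release numbers after padding both to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def assert_client_version(dist_name: str = "hugegraph-python-client",
                          minimum: str = "1.7.0") -> str:
    """Fail loudly if the resolved distribution is older than the floor."""
    # metadata.version raises PackageNotFoundError when the package is absent.
    installed = metadata.version(dist_name)
    if not meets_minimum(installed, minimum):
        raise AssertionError(
            f"{dist_name} resolved to {installed}, expected >= {minimum}"
        )
    return installed
```

With this helper, the 0.1.1 resolution described above would fail the test instead of surfacing later as a routing error. The graphspace routing check still needs its own test.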
| errors.append( | ||
| f"vertex {idx} property '{prop_name}' expects {data_type}, got {type(prop_value).__name__}" | ||
| ) | ||
| primary_keys = schema_primary_keys.get(label, []) |
For CUSTOMIZE_STRING / CUSTOMIZE_NUMBER vertex labels, dry-run validation should require each vertex payload to include id. This branch only validates primary keys when primary_keys exists; if a custom-id label has no primary keys, a vertex missing id passes validate_graph_payload() and dry-run.
I reproduced this with an id_strategy=CUSTOMIZE_STRING person label with no primary keys: dry-run returned valid, but the generated write query was g.addV('person').property('name','Alice') without property(T.id, ...). The failure is delayed until confirmed execution, which means earlier batch writes can already have succeeded before the custom-id vertex fails. That breaks the expected dry-run/confirm safety chain.
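One possible shape for the missing check, as a sketch only. The payload layout (a list of dicts with `label` and `id` keys), the `id_strategies` mapping, and the function name `check_custom_ids` are assumptions made for illustration; the real `validate_graph_payload()` may structure its inputs differently.

```python
CUSTOM_ID_STRATEGIES = {"CUSTOMIZE_STRING", "CUSTOMIZE_NUMBER"}


def check_custom_ids(vertices, id_strategies):
    """Return one error per custom-id vertex with a missing or ill-typed id.

    vertices: list of dicts such as {"label": "person", "id": "p1", ...}
    id_strategies: mapping of vertex label -> id_strategy string
    """
    errors = []
    for idx, vertex in enumerate(vertices):
        label = vertex.get("label")
        strategy = id_strategies.get(label)
        if strategy not in CUSTOM_ID_STRATEGIES:
            continue  # other labels are handled by the primary-key checks
        vertex_id = vertex.get("id")
        if vertex_id is None or vertex_id == "":
            errors.append(
                f"vertex {idx} label '{label}' uses {strategy} and requires 'id'"
            )
        elif strategy == "CUSTOMIZE_NUMBER" and (
            isinstance(vertex_id, bool) or not isinstance(vertex_id, int)
        ):
            errors.append(f"vertex {idx} label '{label}' expects a numeric 'id'")
        elif strategy == "CUSTOMIZE_STRING" and not isinstance(vertex_id, str):
            errors.append(f"vertex {idx} label '{label}' expects a string 'id'")
    return errors
```

Running this independently of the `primary_keys` branch means the reproduced case (custom-id label, no primary keys, no `id`) is rejected at dry-run rather than at confirmed execution.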
| ) | ||
| continue | ||
| for field in REQUIRED_FIELDS[op_type]: |
Schema validate/dry-run can currently return success for schema operations that HugeGraph will reject. The validation mostly checks required fields, duplicate names, and properties / fields references, but it does not validate legal create_property_key.data_type / cardinality values, nor whether vertex/edge primary_keys, nullable_keys, and parent/sub edge-label fields reference live or planned property/edge labels.
I reproduced two invalid inputs that returned valid: true: data_type="NOT_A_TYPE", and a vertex label with primary_keys=["missing"]. Please add these semantic checks plus negative tests, so apply_schema_tool(mode="validate"|"dry_run") does not return a confirmable plan for invalid schema changes.
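A rough sketch of the two semantic checks reproduced above. The operation shape (`type`, `name`, `data_type`, `cardinality`, `primary_keys`, `nullable_keys`), the op names other than `create_property_key`, and the function name are assumptions for illustration. The sets of legal values are also my assumption and should be checked against the HugeGraph schema documentation before use.

```python
# Assumed legal values; verify against the HugeGraph schema docs.
LEGAL_DATA_TYPES = {
    "TEXT", "INT", "LONG", "FLOAT", "DOUBLE",
    "BOOLEAN", "BYTE", "DATE", "UUID", "BLOB",
}
LEGAL_CARDINALITIES = {"SINGLE", "LIST", "SET"}
LABEL_OPS = ("create_vertex_label", "create_edge_label")  # hypothetical op names


def check_schema_semantics(operations, live_property_keys=()):
    """Return semantic errors for a list of planned schema operations."""
    errors = []
    known_keys = set(live_property_keys)

    # Pass 1: validate property keys and record every planned name.
    for op in operations:
        if op.get("type") != "create_property_key":
            continue
        name = op.get("name")
        data_type = op.get("data_type")
        cardinality = op.get("cardinality", "SINGLE")  # default is an assumption
        if data_type not in LEGAL_DATA_TYPES:
            errors.append(f"property key '{name}' has illegal data_type {data_type!r}")
        if cardinality not in LEGAL_CARDINALITIES:
            errors.append(f"property key '{name}' has illegal cardinality {cardinality!r}")
        known_keys.add(name)

    # Pass 2: key references must resolve to a live or planned property key.
    for op in operations:
        if op.get("type") not in LABEL_OPS:
            continue
        for field in ("primary_keys", "nullable_keys"):
            for key in op.get(field, []):
                if key not in known_keys:
                    errors.append(
                        f"label '{op.get('name')}' {field} references unknown property '{key}'"
                    )
    return errors
```

The two-pass layout lets a label reference a property key that is created later in the same plan. The parent/sub edge-label references could be checked the same way against a set of live and planned edge labels.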
| ) | ||
| @mcp.tool() |
This raw Gremlin write tool bypasses the dry_run -> plan_hash -> confirm write-safety chain documented in the README. With HUGEGRAPH_MCP_ADMIN_MODE=true and HUGEGRAPH_MCP_READONLY=false, one MCP call can execute arbitrary write Gremlin, including destructive statements such as drops.
If this is an intentional debug escape hatch, please document clearly in the public safety contract and tests that it is outside the safety chain. Otherwise, it should require a per-call confirmation such as confirm=True, or be gated behind a separate debug-only environment flag, so the implementation does not conflict with the documented rule that user-reachable writes follow the safety chain.
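To make the suggestion concrete, here is a sketch of a gate that combines both options (a separate debug flag and a per-call `confirm=True`). `HUGEGRAPH_MCP_ADMIN_MODE` and `HUGEGRAPH_MCP_READONLY` come from this PR; `HUGEGRAPH_MCP_DEBUG_RAW_WRITE`, the default values, the reason codes, and the function names are hypothetical.

```python
def _flag(env, name, default):
    """Read a boolean-like environment value from a mapping."""
    return env.get(name, default).strip().lower() in ("1", "true", "yes")


def check_raw_write(env, confirm):
    """Return (allowed, reason) for a raw Gremlin write request.

    env: mapping of environment variables, e.g. os.environ
    confirm: the per-call confirmation argument supplied by the caller
    """
    # Defaults below are assumptions: readonly on, admin and debug off.
    if _flag(env, "HUGEGRAPH_MCP_READONLY", "true"):
        return False, "READONLY_MODE"
    if not _flag(env, "HUGEGRAPH_MCP_ADMIN_MODE", "false"):
        return False, "ADMIN_MODE_REQUIRED"
    if not _flag(env, "HUGEGRAPH_MCP_DEBUG_RAW_WRITE", "false"):  # hypothetical flag
        return False, "DEBUG_FLAG_REQUIRED"
    if confirm is not True:
        return False, "CONFIRM_REQUIRED"
    return True, "OK"
```

Passing the environment as a mapping keeps the gate testable without mutating `os.environ`. Either the debug flag or the confirm argument alone would satisfy the request above; using both is simply the stricter variant.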
| log_filename = f"{log_filename}.rank{rank}" | ||
| os.makedirs(os.path.dirname(log_filename), exist_ok=True) | ||
| try: |
init_logger(log_output="client.log", stdout_logging=False) now leaves only the NullHandler. os.path.dirname("client.log") returns an empty string, so os.makedirs("") raises OSError; the exception is swallowed and the function returns before creating the RotatingFileHandler.
A plain filename in the current directory is a valid log target, so this should only call makedirs when the dirname is non-empty, then continue creating the file handler.
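The guard described above can be sketched as follows. `make_file_handler` and its size/backup defaults are illustrative and are not the actual `init_logger` code.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def make_file_handler(log_filename, max_bytes=5 * 1024 * 1024, backup_count=3):
    """Create a rotating file handler, creating the parent directory if any."""
    log_dir = os.path.dirname(log_filename)
    # os.path.dirname("client.log") is "", and os.makedirs("") raises
    # FileNotFoundError (an OSError), so only create a directory when one is named.
    if log_dir:
        os.makedirs(log_dir, exist_ok=True)
    handler = RotatingFileHandler(
        log_filename, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    return handler
```

A negative test for the regression would call the logger setup with a bare filename and assert that a file handler, not just the `NullHandler`, is attached.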
| } | ||
| if json is not None: | ||
| kwargs["json"] = json | ||
| if cfg.password: |
When HugeGraph-AI login is enabled, thin_router routes use FastAPI HTTPBearer, but this MCP AI client only sends Basic auth when HUGEGRAPH_PASSWORD is present. There is no Bearer token configuration path.
Please add a token setting such as HUGEGRAPH_AI_TOKEN and send Authorization: Bearer ..., or explicitly document that MCP AI calls require HugeGraph-AI login to be disabled.
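A small sketch of how the header selection might look if a token setting is added. `HUGEGRAPH_AI_TOKEN` is the name suggested in this comment; `build_auth_headers` and its parameters are hypothetical, and letting the Bearer token take precedence over Basic auth is my assumption rather than something the PR specifies.

```python
import base64


def build_auth_headers(user, password, token):
    """Pick an Authorization header: Bearer if a token is set, else Basic."""
    if token:
        # e.g. token read from the proposed HUGEGRAPH_AI_TOKEN setting
        return {"Authorization": f"Bearer {token}"}
    if password:
        raw = f"{user}:{password}".encode("utf-8")
        encoded = base64.b64encode(raw).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    return {}  # unauthenticated request
```

The returned mapping can be merged into the request's headers, so the existing `cfg.password` branch and the new token branch share one code path.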
VGalaxies
left a comment
Review summary
- Blocking: yes
- Summary: The PR still has blocking correctness and write-safety issues in the new MCP/Thin API surface.
- Evidence:
- static review of `git diff origin/main...HEAD`
- `git diff --check origin/main...HEAD` only reports the known blank-line style issue
- static review of
| req.text, | ||
| req.example_prompt, | ||
| "property_graph", | ||
| req.language, |
High: /graph-extract passes language as split_type
hugegraph-llm/src/hugegraph_llm/api/thin_api.py:106
Evidence
`graph_extract_api()` passes `req.language` as the fifth flow argument, but `GraphExtractFlow.prepare()` expects `split_type` in that position and rejects anything except `document`, `paragraph`, or `sentence`.
Impact
- Normal requests with `language="zh"` or `"en"` fail before extraction runs, so the MCP graph extraction path returns `FLOW_EXECUTION_FAILED`.
Requested fix
- Pass the default split type, e.g. `"document"`, before `req.language`, or call the flow with explicit keyword arguments; update the thin API test to assert the real flow contract.
| extra_context: dict[str, Any] = field(default_factory=dict) | ||
| def compute_plan_hash(context: PlanContext) -> str: |
High: Confirm plan hashes are client-forgeable
hugegraph-mcp/hugegraph_mcp/plan_hash.py:55
Evidence
`compute_plan_hash()` is a plain public SHA-256 over `PlanContext`, and `verify_plan_hash()` recomputes it from caller-supplied `nonce` and `expires_at`; `manage_graph_data.py:221` passes those submitted values directly.
Impact
- A caller with writes enabled can compute a valid hash and submit `confirm=True` without first receiving a server-issued dry-run token, bypassing the documented review/confirm safety chain and choosing an arbitrary future expiry.
Requested fix
- Make confirm tokens server-issued and unforgeable, for example with server-side one-time plan records or an HMAC using a server secret, and enforce a bounded TTL.
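As an illustration of the HMAC option, here is a minimal sketch using only the standard library. The function names, the token layout, and the 300-second cap are assumptions, not the PR's API. The sketch does not enforce one-time use; that would need a server-side record of consumed nonces.

```python
import hashlib
import hmac
import json
import secrets
import time

MAX_TTL_SECONDS = 300  # assumed upper bound for a confirm window


def _sign(secret: bytes, plan: dict, nonce: str, expires_at: int) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of plan, nonce, and expiry."""
    message = json.dumps(
        {"plan": plan, "nonce": nonce, "expires_at": expires_at},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_confirm_token(secret: bytes, plan: dict, now=None, ttl=MAX_TTL_SECONDS) -> dict:
    """Called by the server at dry-run time; the client only echoes the result."""
    now = int(time.time()) if now is None else int(now)
    expires_at = now + min(int(ttl), MAX_TTL_SECONDS)
    nonce = secrets.token_hex(16)
    return {
        "nonce": nonce,
        "expires_at": expires_at,
        "plan_hash": _sign(secret, plan, nonce, expires_at),
    }


def verify_confirm_token(secret: bytes, plan: dict, token: dict, now=None) -> bool:
    """Reject expired, over-long, tampered, or foreign-secret tokens."""
    now = int(time.time()) if now is None else int(now)
    expires_at = int(token["expires_at"])
    if expires_at <= now or expires_at - now > MAX_TTL_SECONDS:
        return False
    expected = _sign(secret, plan, str(token["nonce"]), expires_at)
    return hmac.compare_digest(expected, str(token["plan_hash"]))
```

Because the signature depends on a secret the caller never sees, recomputing the hash client-side no longer yields a valid token, and the TTL check rejects any expiry further out than the cap.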
| ) | ||
| @thin_router.post("/graph-import", status_code=status.HTTP_200_OK, response_model=ThinAPIResponse) |
High: Thin write endpoints bypass MCP write controls
hugegraph-llm/src/hugegraph_llm/api/thin_api.py:110
Evidence
`/graph-import` directly schedules `FlowName.IMPORT_GRAPH_DATA`, and `/vid-embeddings/refresh` directly schedules `FlowName.UPDATE_VID_EMBEDDINGS`; the new router is included in `app.py` under auth that defaults to disabled.
Impact
- A default HugeGraph-LLM deployment exposes graph import and VID embedding mutation without MCP readonly/admin gating or the `dry_run -> plan_hash -> confirm` controls.
Requested fix
- Remove these mutating routes from the public thin router, or require explicit authentication/admin authorization plus the same readonly and confirm/dry-run controls before scheduling mutating flows.
Summary
This PR adds `hugegraph-mcp`, a FastMCP-based MCP server for HugeGraph. It is a safe thin adapter for MCP clients/agents to inspect HugeGraph, generate or execute read-only Gremlin, extract graph data from text, and run guarded graph data import/delete or schema preview workflows.
Main Changes
- `hugegraph-mcp` as a uv workspace member with CLI entrypoint, README/zh README, V1 docs, and MCP CI.
- V1 stable tools: `inspect_graph_tool`, `generate_gremlin_tool`, `execute_gremlin_read_tool`, `extract_graph_data_tool`, `import_graph_data_tool`, `delete_graph_data_tool`, `design_schema_tool`, `apply_schema_tool`.
- Admin-gated debug tools: `execute_gremlin_write_tool`, `refresh_vid_embeddings_tool`.
- `dry_run -> plan_hash -> confirm` write protection.
Test
Scope
V1 intentionally excludes GraphRAG QA, SQL/table import, graph data update, and real schema apply. These return `FEATURE_DISABLED` and can be added later in separate PRs.