Improve shared chat scraping errors and coverage#170

Open
TsukinowaRin wants to merge 2 commits into XortexAI:main from TsukinowaRin:codex/context-share-errors

@TsukinowaRin

Addresses #155.

Summary

  • keep the existing ChatGPT, Claude, and Gemini shared-chat extraction path covered with focused parser tests
  • return provider-specific guidance when a known share link is private, expired, missing, or otherwise not extractable
  • add the missing python-multipart runtime dependency required by the FastAPI upload route at import time

Verification

  • uv run --extra dev pytest tests/test_chat_share_extraction.py
  • uv run --extra dev ruff check tests/test_chat_share_extraction.py

Note: a full ruff check still reports pre-existing import-order/unused-import issues in server.py and src/api/routes/memory.py; this PR leaves those unrelated issues alone and keeps its changes to both files scoped to the scrape error behavior.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances the chat scraping functionality by introducing provider-specific error messages for ChatGPT, Claude, and Gemini, and adds a comprehensive test suite for the extraction logic. Feedback highlights the duplication of the _scrape_failure_message function across modules and identifies an inconsistency in src/api/routes/memory.py where error responses lack accurate elapsed_ms timing data.

Comment thread src/api/routes/memory.py Outdated
Comment on lines +310 to +327
def _scrape_failure_message(result: Dict[str, Any]) -> str:
    provider = result.get("provider") or "unknown"

    if provider in {"chatgpt", "claude", "gemini"}:
        display_name = {
            "chatgpt": "ChatGPT",
            "claude": "Claude",
            "gemini": "Gemini",
        }[provider]
        return (
            f"Could not extract messages from this {display_name} share link. "
            "Make sure the link is public, still exists, and has not expired."
        )

    return (
        "Failed to extract messages from the provided link. "
        "Supported public share links are ChatGPT, Claude, and Gemini."
    )

medium

The _scrape_failure_message function is duplicated between src/api/routes/memory.py and server.py. To improve maintainability and ensure consistent error messages across the application, consider centralizing this logic in a shared utility module or exporting it from one of the files. This avoids the need to update multiple locations when adding support for new chat providers.
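One way the centralization suggested here could look is sketched below. The module location (something like src/utils/scrape_errors.py), the public name scrape_failure_message, and the PROVIDER_DISPLAY_NAMES table are assumptions made for illustration; the thread does not say where the shared helper should live. The message strings are copied from the diff above.

```python
"""Hypothetical shared helper; the module path and names are assumptions."""
from typing import Any, Dict

# Single source of truth for supported providers and their display names.
# Supporting a new provider would mean adding one entry here.
PROVIDER_DISPLAY_NAMES: Dict[str, str] = {
    "chatgpt": "ChatGPT",
    "claude": "Claude",
    "gemini": "Gemini",
}


def scrape_failure_message(result: Dict[str, Any]) -> str:
    """Return user-facing guidance for a failed share-link extraction."""
    # A missing or None provider falls through to the generic message.
    provider = result.get("provider") or "unknown"
    display_name = PROVIDER_DISPLAY_NAMES.get(provider)

    if display_name is not None:
        return (
            f"Could not extract messages from this {display_name} share link. "
            "Make sure the link is public, still exists, and has not expired."
        )

    return (
        "Failed to extract messages from the provided link. "
        "Supported public share links are ChatGPT, Claude, and Gemini."
    )
```

Both server.py and src/api/routes/memory.py could then import this one function instead of each defining its own copy.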

Comment thread src/api/routes/memory.py
Comment on lines 779 to 783
     if not pairs:
-        return _error(request, "Failed to extract messages from the provided link.", 400)
+        return _error(request, _scrape_failure_message(result), 400)

     data = ScrapeResponse(pairs=pairs)
     elapsed = round((time.perf_counter() - start) * 1000, 2)

medium

The error response for missing message pairs is missing the elapsed_ms timing information, which results in a default value of 0.0. This is inconsistent with the success response and the implementation in server.py. Calculating the elapsed time before the check ensures that the user receives accurate timing even when extraction fails.

Suggested change
-    if not pairs:
-        return _error(request, "Failed to extract messages from the provided link.", 400)
-        return _error(request, _scrape_failure_message(result), 400)
-    data = ScrapeResponse(pairs=pairs)
-    elapsed = round((time.perf_counter() - start) * 1000, 2)
+    elapsed = round((time.perf_counter() - start) * 1000, 2)
+    if not pairs:
+        return _error(request, _scrape_failure_message(result), 400, elapsed)
+    data = ScrapeResponse(pairs=pairs)
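The ordering in this suggestion can be tried in isolation with the minimal sketch below. The _error stand-in, the build_scrape_response wrapper, and the injectable clock parameter are all hypothetical; the real _error helper also takes the request, and its full signature is not shown in this thread.

```python
import time
from typing import Any, Callable, Dict, List


def _error(message: str, status: int, elapsed_ms: float = 0.0) -> Dict[str, Any]:
    """Stand-in for the route's error helper (real signature is an assumption)."""
    return {"status": status, "error": message, "elapsed_ms": elapsed_ms}


def build_scrape_response(
    pairs: List[Any],
    start: float,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Any]:
    """Build a success or error payload that always carries real timing."""
    # Measure once, before branching, so the error path does not fall back
    # to the 0.0 default.
    elapsed = round((clock() - start) * 1000, 2)

    if not pairs:
        return _error("Could not extract messages.", 400, elapsed)

    return {"status": 200, "pairs": pairs, "elapsed_ms": elapsed}


# Usage: record the start time before scraping, then build the response.
start = time.perf_counter()
response = build_scrape_response([], start)
```

One trade-off of measuring before the check: on the success path, the reported time no longer includes the cost of constructing the response object.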

@TsukinowaRin TsukinowaRin force-pushed the codex/context-share-errors branch from 317b53d to 3f793f2 Compare May 11, 2026 06:51
@ishaanxgupta
Member

Hi @TsukinowaRin, thanks for the contribution. Could you share a video of the working functionality with a Gemini/Claude link?
