Skip to content

Refactor: Deduplicate normalize_file_path and to_epoch_ms in export script (closes #46)#51

Merged
wpak-ai merged 4 commits into
masterfrom
fix/duplicated-normalize-file-path
May 19, 2026
Merged

Refactor: Deduplicate normalize_file_path and to_epoch_ms in export script (closes #46)#51
wpak-ai merged 4 commits into
masterfrom
fix/duplicated-normalize-file-path

Conversation

@bradjin8
Copy link
Copy Markdown
Collaborator

@bradjin8 bradjin8 commented May 18, 2026

Closes #46

Problem

scripts/export.py maintained private copies of normalize_file_path and to_epoch_ms that already existed in utils/path_helpers.py. The two normalize_file_path copies had diverged: the export copy was missing the non-Windows cross-platform branch that handles Windows-style drive paths (D:/..., D:\...) on Linux/macOS — meaning cross-platform reads of Cursor's Windows workspaceStorage could produce inconsistent path keys depending on which code path was used.

Changes

scripts/export.py

  • Removed local normalize_file_path (~13 lines) and to_epoch_ms (~22 lines)
  • Expanded the utils.path_helpers import to include both functions alongside the existing get_workspace_folder_paths alias

tests/test_normalize_file_path.py (new)

  • 21 test cases in this file (12 normalize_file_path, 9 to_epoch_ms; 2 skip on win32 for POSIX passthrough only) covering the shared implementation:
    • file:/// and file:// URI stripping
    • Percent-encoded characters: %20 (space), %23 (#), %3A (colon in URI-style prefix)
    • Windows drive paths with backslashes and forward-slashes on all platforms
    • Drive-letter lowercasing
    • Plain POSIX path passthrough
    • Empty string

Notes

  • The canonical implementation in utils/path_helpers.py is a strict superset of the old export copy — it adds the non-Windows ^[a-zA-Z]:[/\\] branch for Windows-style paths on Linux/macOS, which the export copy lacked.
  • The sys.path.insert block in scripts/export.py is intentionally left in place; removing it is part of the larger export-script-to-service-layer refactor (issue Add an optional parameter to disable rendering/exporting sensitive projects/chats #1).
  • samples/example_chat_export.md quotes the old function definitions as documentation; not touched.

Verification

python -m unittest tests.test_normalize_file_path -v
# Ran 21 tests … OK (skipped=2)   # win32: 2 POSIX-only cases skip

python -m unittest discover tests -q
# Ran 274 tests in 2.1s
# OK (skipped=4)   # suite-wide; includes skips from other modules

Summary by CodeRabbit

  • Refactor

    • Consolidated utility functions into shared modules.
  • Tests

    • Significantly expanded test coverage for path normalization and timestamp conversion utilities. Added comprehensive test cases covering cross-platform compatibility, percent-decoding, URI handling, and various timestamp input formats including ISO-8601 and numeric values.

Review Change Stack

@bradjin8 bradjin8 self-assigned this May 18, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6038296c-7fec-425d-85c6-a8f33c30095b

📥 Commits

Reviewing files that changed from the base of the PR and between 011f6e8 and 88b7b10.

📒 Files selected for processing (1)
  • tests/test_normalize_file_path.py

📝 Walkthrough

Walkthrough

This PR imports normalize_file_path and to_epoch_ms from utils.path_helpers into scripts/export.py (removing local copies) and adds tests/test_normalize_file_path.py with tests for URI stripping, percent-decoding, Windows drive normalization, and POSIX passthrough.

Changes

Path Normalization Deduplication and Testing

Layer / File(s) Summary
Import shared helper functions
scripts/export.py
Updated imports to pull get_workspace_folder_paths, normalize_file_path, and to_epoch_ms from utils.path_helpers, removing local duplicates of normalize_file_path and to_epoch_ms.
Test coverage for normalize_file_path and to_epoch_ms
tests/test_normalize_file_path.py
Added a new test module with tests for URI scheme stripping (file:///, file://, empty input), percent-decoding (%20, %23, %3A with platform differences), Windows drive normalization (backslash/forward-slash handling, drive-letter lowercasing, file:/// with drives, mixed-case drives), POSIX passthrough behavior (skipped on win32 where appropriate), and to_epoch_ms conversions and edge cases.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through paths and tidied the trail,
One helper now governs each file and each tale.
URIs unwrapped, drives humbled to case,
Percent codes resolved, no more duplicate place.
🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main refactoring change: deduplicating normalize_file_path and to_epoch_ms by removing local implementations from export.py and importing from utils.path_helpers.
Linked Issues check ✅ Passed All acceptance criteria from issue #46 are met: single canonical normalize_file_path in utils.path_helpers, scripts/export.py imports both normalize_file_path and to_epoch_ms from utils.path_helpers, comprehensive test coverage added for Windows paths/URIs/percent-encoding/edge cases, and audit for duplicates (to_epoch_ms) completed.
Out of Scope Changes check ✅ Passed All changes are directly related to the stated objective of deduplicating normalize_file_path and to_epoch_ms. The three files modified (scripts/export.py, tests/test_normalize_file_path.py, and samples/example_chat_export.md) all support the refactoring goal with no unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/duplicated-normalize-file-path

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_normalize_file_path.py`:
- Around line 38-42: The test_space_decoded currently expects mixed-case "My
Documents" on non-Windows but lowercased on Windows, which is inconsistent with
other Windows-drive normalization tests; update test_space_decoded to always
expect the lowercased form for the Windows-drive URI case by asserting that the
returned path contains "my documents" (and still assertNotIn("%20")), so the
expectation matches normalize_file_path behavior for "file:///C:/..." inputs and
aligns with the other Windows-drive tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3977d42e-d03a-4ed2-9fa3-5cf0189f866e

📥 Commits

Reviewing files that changed from the base of the PR and between 91a8d6e and fb5d9e4.

📒 Files selected for processing (2)
  • scripts/export.py
  • tests/test_normalize_file_path.py

Comment thread tests/test_normalize_file_path.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_normalize_file_path.py`:
- Around line 80-84: The test test_file_uri_with_windows_drive currently accepts
either lowercase or uppercase drive letters but the normalization is documented
and other tests expect lowercase; update the assertion in
test_file_uri_with_windows_drive to require a lowercase drive prefix by
asserting out.startswith("c:") (or otherwise enforce out[0].islower() for the
drive letter) so the test fails if normalize_file_path stops lowercasing drive
letters; reference normalize_file_path and the test_file_uri_with_windows_drive
function when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c1a6a08-5ed1-40af-8523-e92774802105

📥 Commits

Reviewing files that changed from the base of the PR and between fb5d9e4 and a3688c0.

📒 Files selected for processing (1)
  • tests/test_normalize_file_path.py

Comment thread tests/test_normalize_file_path.py Outdated
@bradjin8 bradjin8 requested a review from timon0305 May 18, 2026 19:10
Comment thread tests/test_normalize_file_path.py
Comment thread tests/test_normalize_file_path.py
Comment thread tests/test_normalize_file_path.py
Comment thread tests/test_normalize_file_path.py
@bradjin8 bradjin8 requested a review from timon0305 May 19, 2026 05:14
@timon0305 timon0305 requested a review from wpak-ai May 19, 2026 12:49
@wpak-ai wpak-ai merged commit bbaf171 into master May 19, 2026
7 checks passed
@wpak-ai wpak-ai deleted the fix/duplicated-normalize-file-path branch May 19, 2026 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Duplicated normalize_file_path in Export Script

3 participants