Skip to content

DEVEX-694: Fix API catalog $ref dereferencing and add $defs deduplication#50

Open
jpage-godaddy wants to merge 5 commits into
rust-portfrom
fix-refs
Open

DEVEX-694: Fix API catalog $ref dereferencing and add $defs deduplication#50
jpage-godaddy wants to merge 5 commits into
rust-portfrom
fix-refs

Conversation

@jpage-godaddy
Copy link
Copy Markdown
Collaborator

@jpage-godaddy jpage-godaddy commented Jun 2, 2026

Summary

  • Fixes broken API catalog schemas: the committed rust/schemas/api/*.json had 587 unresolved external $refs across 16 of 20 domains, making godaddy api describe return broken schema references. The TypeScript generator produced zero external refs; now the Rust generator matches.
  • Adds $defs deduplication: instead of fully inlining every referenced schema at each use site, repeated schemas are lifted into a top-level $defs section and replaced with local {"$ref": "#/$defs/Name"} pointers. This is standard JSON Schema — any compliant reader resolves local refs. orders.json shrinks from ~60K lines to ~4K lines (−94%).

Root causes fixed

  • Local refs resolving to nodes that themselves contained $ref were never recursively dereferenced
  • Some spec repos store models at v2/models/ not v2/schemas/models/; added sibling-directory fallback
  • Spec files authored on macOS had wrong-case filenames (Bulkingestion.yaml vs BulkIngestion.yaml); added case-insensitive path fallback
  • Invalid YAML escape (\.) in double-quoted strings; added preprocessing on parse retry
  • Submodule content (transaction-models) was absent because clone used --depth 1 without --recurse-submodules; fixed

Size reduction from $defs deduplication

File Before (TS original) After Reduction
orders.json 59,837 lines 3,888 lines −94%
bulk-operations.json 17,592 lines 2,512 lines −86%
transactions.json 31,562 lines 4,122 lines −87%
metafields.json 11,939 lines 1,753 lines −85%

Schemas are include_str!'d at compile time, so smaller files mean a smaller binary.

Test plan

  • cargo test — 114 tests pass
  • cargo clippy -- -D warnings — clean
  • cargo fmt --check — clean
  • CI static check: grep -r '"[$]ref"' rust/schemas/api/*.json | grep -v '"#/' | wc -l returns 0
  • godaddy api describe orders — returns schemas with $ref: "#/$defs/..." local references (no broken file-path refs)

🤖 Generated with Claude Code

**Fix: unresolved external $refs (was: 587 broken refs across 16 domains)**

The generator was silently preserving unresolved $refs on failure. Root
causes fixed:
- Local refs resolving to nodes that themselves contained $ref were never
  recursively dereferenced; now a second dereference pass is applied
- Some spec repos store models at v2/models/ not v2/schemas/models/; added
  sibling-directory fallback in resolve_ref()
- Spec files authored on macOS had wrong-case filenames (e.g.
  Bulkingestion.yaml vs BulkIngestion.yaml); added case-insensitive path
  resolution fallback
- Invalid YAML escape (\.) in double-quoted strings; added preprocessing
  on parse retry
- Submodule content (transaction-models) was absent because clone used
  --depth 1 without --recurse-submodules; fixed

Added collect_unresolved_refs() post-generation validator that fails the
build loudly with a full list of unresolved refs instead of silently
producing broken output.

Added CI static check in cicd.yml that greps committed schemas for
unresolved external $refs on every PR.

**Improvement: $defs deduplication (was: Metafield inlined 167×, Error 136×)**

External file refs and local #/components/schemas/ refs are now lifted into
a top-level $defs section instead of being fully inlined at every use site.
The same ref URL produces one $defs entry; all occurrences become local
{"$ref": "#/$defs/Name"} pointers. This is standard JSON Schema — any
compliant reader resolves local refs.

Result: orders.json 59,837 → 3,888 lines (-94%), bulk-operations.json
17,592 → 2,512 lines (-86%). Smaller catalog files mean a smaller binary
(schemas are include_str!'d at compile time).

Added indexmap dependency for stable $defs insertion order.
Added 8 regression tests covering all new behaviors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Rust API catalog generator to (1) fully dereference previously-broken $refs in upstream OpenAPI/JSON Schema sources and (2) reduce committed schema size by deduplicating repeated schemas into a top-level $defs with local #/$defs/... references, aligning Rust output with the TypeScript generator.

Changes:

  • Fix $ref dereferencing by recursively resolving referenced nodes (including nested $refs), adding repo layout fallbacks, case-insensitive file lookup, YAML parse retry normalization, and cloning submodules.
  • Add $defs lifting/deduplication so repeated schemas are emitted once and referenced locally.
  • Add CI enforcement to prevent committed schemas from containing unresolved external $refs.

Reviewed changes

Copilot reviewed 11 out of 25 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
rust/tools/generate-api-catalog/src/main.rs Implements recursive $ref resolution, $defs lifting/deduplication, repo/YAML fallbacks, and adds focused unit tests.
rust/tools/generate-api-catalog/Cargo.toml Adds indexmap for deterministic $defs tracking and tempfile for tests.
rust/Cargo.lock Locks new dependencies (indexmap, tempfile).
.github/workflows/cicd.yml Adds a CI check to fail if committed API schemas contain unresolved external $refs.
rust/schemas/api/taxes.json Regenerated catalog output using $defs and local refs.
rust/schemas/api/shipping.json Regenerated catalog output using $defs and local refs (major size/duplication reduction).
rust/schemas/api/recommendations.json Regenerated catalog output using $defs and local refs.
rust/schemas/api/metafields.json Regenerated catalog output using $defs and local refs (major size/duplication reduction).
rust/schemas/api/location-addresses.json Regenerated catalog output using $defs and local refs.
rust/schemas/api/manifest.json Updates generated timestamp and domain entries ordering after regeneration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread rust/tools/generate-api-catalog/src/main.rs Outdated
Comment thread rust/tools/generate-api-catalog/src/main.rs
Comment thread .github/workflows/cicd.yml Outdated
- Escape $defs keys as JSON Pointer segments (RFC 6901) when formatting
  $ref strings; keys containing ~ or / would otherwise resolve incorrectly
- Handle \" inside YAML double-quoted scalars in fix_yaml_invalid_dot_escapes
  so an escaped quote doesn't prematurely terminate the scalar
- Rewrite CI $ref check using `if grep | grep -v; then exit 1; fi` to avoid
  pipefail triggering on zero matches (grep exits 1 when nothing matches)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 25 changed files in this pull request and generated 4 comments.

Comment thread rust/tools/generate-api-catalog/src/main.rs Outdated
Comment thread rust/schemas/api/location-addresses.json Outdated
Comment thread rust/schemas/api/location-addresses.json Outdated
Comment thread rust/schemas/api/metafields.json
- Scan $defs values (not just spec) for unresolved external $refs after
  dereferencing, so the generator fails early if any slip through
- Wire response schema to lifted $defs ref in process_responses: when a
  response object is a {"$ref": "#/$defs/X"} pointer, populate the
  schema field so api describe exposes the error shape
- Regenerate schemas: fixes customer-profiles $ref pointer escaping
  (json_pointer_escape applied correctly now), and location-addresses
  error responses now include schema field pointing to $defs.error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 25 changed files in this pull request and generated 3 comments.

Comment thread rust/tools/generate-api-catalog/src/main.rs
Comment thread rust/tools/generate-api-catalog/src/main.rs
Comment thread .github/workflows/cicd.yml
- Fix derive_defs_key_for_path to use component name from fragment for
  external refs like ./common.yaml#/components/schemas/Foo, preventing
  key collisions when two components from the same file get the same stem
- Preserve original YAML parse error in retry fallback: first error is
  now included in the failure message instead of being silently discarded
- Add set -o pipefail to CI ref-check step for reliable pipeline failure
  propagation
- Add test cases for fragment-based key derivation
- Regenerate schemas to pick up upstream percentage $id update

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 25 changed files in this pull request and generated 1 comment.

Comment thread .github/workflows/cicd.yml
The previous grep -v '"#/' filter used a substring match, so a ref like
"./common.yaml#/components/schemas/Foo" (which contains "#/" in its
fragment) would be incorrectly excluded and an unresolved external ref
would go undetected.

Replace the two-grep pipeline with a single pattern that directly matches
$ref values whose value starts with a non-# character, which is the
correct semantics for "this is an external ref".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 25 changed files in this pull request and generated no new comments.

@jpage-godaddy jpage-godaddy changed the title Fix API catalog $ref dereferencing and add $defs deduplication DEVEX-694: Fix API catalog $ref dereferencing and add $defs deduplication Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants