docs: document the backup/restore ZIP archive format #604
irfanuddinahmad wants to merge 2 commits
Adds a reference page describing the TOML-based ZIP format produced by `create_zip_file` / `lp_dump` and consumed by `load_learning_package` / `lp_load`. Covers the full archive layout, every TOML file schema with field-level descriptions and annotated examples drawn from the test fixtures, the XBlock XML placement convention, and quick-start usage snippets for both the management commands and the Python API.

Closes openedx#492

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
This PR adds official documentation for the ZIP-based learning-package backup/restore format used by the backup_restore applet, and links it into the openedx_content docs section so operators and developers can understand and inspect archives produced/consumed by lp_dump / lp_load.
Changes:
- Add a new reference page documenting the archive layout and TOML/XML schemas used in backup ZIPs.
- Include export/restore quick-start examples for both management commands and the Python API.
- Link the new page from the `docs/openedx_content` index.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| docs/openedx_content/index.rst | Adds the new backup/restore format page to the openedx_content docs toctree. |
| docs/openedx_content/backup_restore.rst | New documentation page describing the backup ZIP layout and file formats. |
- Overview: clarify that only draft and published versions are exported, not the full history
- `origin_server`: free-form string, not a validated hostname
- `[learning_package]` heading: note that `key` may be overridden and `updated` is not restored
- `updated` field: mark as reference-only, not applied during restore
- `[entity.published]`: always present (empty table with a comment when unpublished)
- `[[version]]`: at most 2 entries, draft first, then published if different
- Example: fix version order to draft (v5) first, then published (v4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
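
The `[[version]]` ordering rule in this commit message can be read as a small selection function. The sketch below is only an illustration of that rule: `versions_to_export` and its integer arguments are hypothetical names, not part of the applet's API, and it assumes that an entity without a draft simply contributes no draft entry.

```python
from typing import Optional


def versions_to_export(draft: Optional[int], published: Optional[int]) -> list[int]:
    """Return the version numbers to write as [[version]] entries.

    Hypothetical helper: the draft comes first, and the published version is
    added only when it differs, so the result never has more than two entries.
    """
    versions: list[int] = []
    if draft is not None:
        versions.append(draft)
    if published is not None and published != draft:
        versions.append(published)
    return versions


print(versions_to_export(draft=5, published=4))     # draft v5 first, then published v4
print(versions_to_export(draft=5, published=5))     # same version, written once
print(versions_to_export(draft=5, published=None))  # unpublished entity
```

The first call mirrors the corrected example in the commit: draft (v5) first, then published (v4).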
@ormsbee did you say that you had a pending review on this, or should I do a close read?
ormsbee left a comment
Thank you for your patience on this review.
Backup / Restore Format
=======================

The ``backup_restore`` applet lets you export a learning package (V2 content

Suggested change:
- The ``backup_restore`` applet lets you export a learning package (V2 content
+ The ``backup_restore`` applet lets you back up a learning package (V2 content
We're intentionally using "backup/restore" to distinguish it from the incremental import/export functionality that we plan to add in the future.
published versions are exported -- the full version history is not preserved.

The archive uses `TOML <https://toml.io>`_ for all metadata files and keeps the
actual XBlock content as XML (the same ``block.xml`` format Studio has always

Suggested change:
- actual XBlock content as XML (the same ``block.xml`` format Studio has always
+ component XBlock content as XML (the same OLX format Studio has always
In modulestore, the XML files are not named block.xml. Also, the old XML format is being kept for components (e.g. problems, videos), but not for structural container types like units and subsections.
Also, it's probably worth noting that the naming is different: in courses, each component would be exported with its block_id as the name of the file. That's usually a machine-generated ID (since that's the default in Split), but sometimes it's a meaningful identifier when authored by hand. For our export format, the OLX is always block.xml, and it's the metadata in the parent TOML file that gives the identifier.
--------

A backup ZIP is a self-contained snapshot of one learning package. It captures
every component, collection, container (sections / subsections / units), and

Suggested change:
- every component, collection, container (sections / subsections / units), and
+ every component, collection, container (section / subsection / unit), and
Overview
--------

A backup ZIP is a self-contained snapshot of one learning package. It captures
We should clarify the difference between a Learning Package and a Library. Namely, that a Library has one and only one Learning Package where it stores its content, but Learning Packages can also stand alone. The restore process creates a temporary Learning Package that can be reviewed by the user, and then later associates that Learning Package with a newly created Library.
When provided it overrides the ``key`` stored in ``package.toml``, which
is useful when importing a library under a new reference.
We should use stronger language here. It's really dangerous to trust the archive for either the package_ref or the user, and callers should explicitly pass those to load_learning_package unless they really, really know what they're doing.
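
One way to make that guidance concrete is a guard that refuses to fall back on archive values unless the caller opts in. This is a sketch under assumptions: `resolve_restore_identity`, `archive_meta`, `trust_archive`, and the `key` / `user` fields are hypothetical names chosen for illustration, and the real `load_learning_package` signature may look quite different.

```python
from typing import Mapping, Optional


def resolve_restore_identity(
    archive_meta: Mapping[str, str],
    package_ref: Optional[str] = None,
    user: Optional[str] = None,
    trust_archive: bool = False,
) -> tuple[str, str]:
    """Pick the package reference and user for a restore.

    Hypothetical helper: values read from the archive are used only when the
    caller explicitly opts in with trust_archive=True.
    """
    if package_ref is None or user is None:
        if not trust_archive:
            raise ValueError(
                "package_ref and user must be passed explicitly; "
                "the archive contents are untrusted"
            )
        package_ref = package_ref or archive_meta["key"]
        user = user or archive_meta["user"]
    return package_ref, user


# Made-up values: the archive claims one identity, the caller supplies another.
untrusted = {"key": "lib:unknown:from-archive", "user": "someone-else"}
print(resolve_restore_identity(untrusted, package_ref="lib:example:restored", user="alice"))
```

Making the unsafe path opt-in keeps the default behavior aligned with the reviewer's point: the archive never decides where content lands or who owns it.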
title = "Text"
version_num = 4

Container entity TOML (``entities/<slug>.toml``)
We should explain what a <slug> is: This is the last part of the entity_ref if there is no collision, but if the last parts of the entity_ref collide (e.g. a Unit and an HTMLBlock that are both "intro"), then a short hash gets appended.
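
To illustrate the collision rule described here, a minimal sketch follows. Everything about it is an assumption made for the example: `make_slug` is a hypothetical name, the `:` separator, the SHA-1 digest, the six-character suffix, and the underscore joiner are placeholders, and the real applet may resolve collisions differently (for instance by suffixing both colliding entities).

```python
import hashlib


def make_slug(entity_ref: str, taken: set[str]) -> str:
    """Hypothetical slug builder: last part of entity_ref, plus a short hash on collision."""
    base = entity_ref.rsplit(":", 1)[-1]
    if base not in taken:
        slug = base
    else:
        # Collision: append a short digest of the full entity_ref (assumed scheme).
        digest = hashlib.sha1(entity_ref.encode("utf-8")).hexdigest()[:6]
        slug = f"{base}_{digest}"
    taken.add(slug)
    return slug


# Made-up entity refs for an HTMLBlock and a Unit that both end in "intro".
taken: set[str] = set()
print(make_slug("xblock.v1:html:intro", taken))  # intro
print(make_slug("unit:intro", taken))            # "intro_" followed by six hex characters
```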
XBlock content (``component_versions/v<N>/block.xml``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Standard XBlock XML, identical to what Studio stores internally. Static assets
There is a difference in HTMLBlock storage. Namely, we don't currently support storing a separate HTML file, so we inline the HTML with CDATA. In courses, we'd have a tiny XML file for the HTMLBlock that pointed to the HTML file.
This is a limitation of our XBlock serialization, but one I hope we can fix before Willow.
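
For readers unfamiliar with the CDATA approach, here is a small sketch of what reading such a file could look like. The element name, the `display_name` attribute, and the markup are made up for illustration and are not taken from a real backup; only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical block.xml for an HTMLBlock with its HTML inlined as CDATA.
block_xml = '<html display_name="Intro"><![CDATA[<p>Hello, <b>world</b>!</p>]]></html>'

root = ET.fromstring(block_xml)
print(root.tag)                  # html
print(root.get("display_name"))  # Intro
print(root.text)                 # <p>Hello, <b>world</b>!</p>
print(len(root))                 # 0: the inlined markup is text, not child elements
```

The parser hands back the CDATA section as plain text, so the inlined HTML survives the round trip without being interpreted as XML children.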
Summary
- `docs/openedx_content/backup_restore.rst`: a full reference page for the TOML-based ZIP format produced by `lp_dump` / `create_zip_file` and consumed by `lp_load` / `load_learning_package`.
- `docs/openedx_content/index.rst`: adds the new page to the toctree.

Closes #492
Test plan
- `cd docs && make html` (or `make dirhtml`): confirms the RST renders without Sphinx warnings
- `tests/openedx_content/applets/backup_restore/fixtures/library_backup/`
- Run `lp_dump` on a real library and compare the output ZIP layout to the documented structure

🤖 Generated with Claude Code
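
For the layout comparison in the test plan, a short script using the standard library's `zipfile` module can list an archive's members. This is a sketch, not part of the PR: `list_archive` is a hypothetical helper, and the two member names in the stand-in archive are placeholders loosely based on file names mentioned in the review, not a complete or authoritative layout.

```python
import io
import zipfile


def list_archive(zip_bytes: bytes) -> list[str]:
    """Return the sorted member names of a backup ZIP held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return sorted(archive.namelist())


# Build a tiny stand-in archive so the sketch runs without a real backup.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("package.toml", "[learning_package]\n")
    archive.writestr("entities/intro.toml", "")

names = list_archive(buffer.getvalue())
for name in names:
    print(name)
```

With a real backup, the bytes would come from the file written by `lp_dump`, and the printed names can be checked against the documented structure by eye.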