PyCropPDF is a desktop application designed to crop and mask PDF files. It is particularly suited for documents where multiple pages share a common layout structure—such as scanned books, academic papers, or reports—allowing you to remove margins, headers, footers, or watermarks across many pages simultaneously.
Instead of showing pages one-by-one, PyCropPDF renders and overlays page previews transparently on top of each other. This enables you to visually check that a selected crop boundary or whiteout mask fits every page in the document (or group of pages) without clipping text or diagrams.
- All Pages (Overlay): Overlays all document pages together. Helpful for verifying margins across the entire document.
- Odd / Even Pages: Splits the preview into two separate overlays—one for odd pages and one for even pages. This is useful for double-sided documents (like bound books) where the left and right margins alternate.
- Single Page Preview: Focuses on a single page, allowing you to fine-tune boundaries for specific layout exceptions.
- Crop Box: Defines a bounding box. When applied, pages are cropped to this box (modifying the PDF page's physical boundaries).
- Whiteout Mask: Applies solid rectangular overlays (masks) to cover page content. You can choose a custom color to match the page background.
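The overlay check described above is done by eye in the application, but the underlying idea can be sketched numerically: a crop box is safe for a group of pages when it contains the union of their content areas. The sketch below is only an illustration, not PyCropPDF's code. The `Rect` tuple layout, the function names, and the assumption that a content bounding box is already known for each page are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# A rectangle as (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1 (assumed layout).
Rect = Tuple[float, float, float, float]


def union_rect(rects: Iterable[Rect]) -> Rect:
    """Return the smallest rectangle that contains every rectangle given."""
    rects = list(rects)
    if not rects:
        raise ValueError("need at least one rectangle")
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely inside `outer` (edges may touch)."""
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and outer[2] >= inner[2]
        and outer[3] >= inner[3]
    )


def group_by_parity(content_boxes: List[Rect]) -> Dict[str, List[Rect]]:
    """Split per-page boxes into odd and even groups (page numbers start at 1)."""
    return {
        "odd": content_boxes[0::2],   # pages 1, 3, 5, ...
        "even": content_boxes[1::2],  # pages 2, 4, 6, ...
    }


def crop_is_safe(crop: Rect, content_boxes: List[Rect]) -> bool:
    """True if the crop box clips no page's content in this group."""
    return contains(crop, union_rect(content_boxes))
```

With alternating margins, a single box that fits the odd pages can clip the even ones, which is why checking each parity group against its own box is useful.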
A sidebar displays thumbnails of all pages.
- Multi-selection is supported using Ctrl + Click (toggle individual pages) and Shift + Click (select a range of pages).
- Operations (cropping and whiteout) can be limited to the selected pages.
- Selected pages can be deleted from the document.
- Operations can be undone step-by-step.
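The selection behaviour listed above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the application's implementation: the `PageSelection` class is hypothetical, page indices are assumed to be 0-based, and the plain-click and anchor rules follow the convention common to file managers rather than anything documented here.

```python
from typing import Optional, Set


class PageSelection:
    """Tracks selected page indices and the anchor used for range selection."""

    def __init__(self) -> None:
        self.selected: Set[int] = set()
        self.anchor: Optional[int] = None

    def click(self, index: int) -> None:
        """Plain click (assumed behaviour): select only this page."""
        self.selected = {index}
        self.anchor = index

    def ctrl_click(self, index: int) -> None:
        """Ctrl + Click: toggle one page without touching the others."""
        self.selected ^= {index}
        self.anchor = index

    def shift_click(self, index: int) -> None:
        """Shift + Click: select every page from the anchor to this page."""
        start = index if self.anchor is None else self.anchor
        low, high = sorted((start, index))
        self.selected = set(range(low, high + 1))
```

Keeping the anchor separate from the selected set is what lets a range be extended in either direction from the last clicked page.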
When you save a modified PDF, the application generates a JSON sidecar manifest (e.g., document_modified.pdf.pycroppdf.json). This manifest records:
- SHA-256 hashes of the source and output PDF files.
- Original page counts and mapping of output pages to original pages.
- Explicit lists of deleted page indices.
- Exact coordinates and dimensions of crops and whiteouts applied.
This manifest enables downstream automated pipelines to trace the history and modifications of the edited PDF back to its original source.
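A downstream pipeline might use the manifest roughly as follows. This is a sketch only: the manifest's actual field names are not documented in this section, so the `output_sha256` key is a hypothetical placeholder, and the page mapping assumes 0-based indices with pages deleted but never reordered.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def output_to_original(original_page_count: int, deleted: List[int]) -> List[int]:
    """For each output page, return the original page index it came from.

    Assumes 0-based indices and no reordering (both are assumptions).
    """
    removed = set(deleted)
    return [i for i in range(original_page_count) if i not in removed]


def output_matches_manifest(
    pdf_path: Path,
    manifest_path: Path,
    hash_key: str = "output_sha256",  # hypothetical key name
) -> bool:
    """Check that the PDF on disk is the file the manifest describes."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return manifest[hash_key] == sha256_of_file(pdf_path)
```

For example, `output_to_original(5, [1, 3])` returns `[0, 2, 4]`: the second output page corresponds to the third original page.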
- Python 3.8 or higher.
You can install PyCropPDF directly using pip:
pip install pycroppdf
To clone the repository and install it in editable mode for development:
git clone https://github.com/lukaszliniewicz/PyCropPDF.git
cd PyCropPDF
pip install -e .
Start the graphical interface from the terminal:
pycroppdf
To start the interface with a PDF already loaded:
pycroppdf --input /path/to/document.pdf
The application accepts arguments to pre-configure paths and manifest outputs:
- --input /path/to/file.pdf: Loads the specified PDF at launch.
- --save-to /path/to/directory/: Specifies the folder where the edited PDF will be saved.
- --save-as filename.pdf: Specifies the filename for the output PDF.
- --manifest-out /path/to/output.json: Overrides the default location for the JSON provenance manifest.
Note: If --save-to or --save-as is specified, the standard "Save File" dialog is bypassed. Clicking "Save PDF..." immediately saves the file to the pre-configured path.
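When PyCropPDF is launched from another script, the argument list can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of PyCropPDF; it uses only the four flags listed above and assumes the `pycroppdf` command is on the PATH.

```python
import shlex
from typing import List, Optional


def build_pycroppdf_command(
    input_pdf: Optional[str] = None,
    save_to: Optional[str] = None,
    save_as: Optional[str] = None,
    manifest_out: Optional[str] = None,
) -> List[str]:
    """Build an argument list for launching the GUI with preset paths."""
    command = ["pycroppdf"]
    options = (
        ("--input", input_pdf),
        ("--save-to", save_to),
        ("--save-as", save_as),
        ("--manifest-out", manifest_out),
    )
    for flag, value in options:
        if value is not None:  # omit flags the caller did not set
            command += [flag, value]
    return command


if __name__ == "__main__":
    cmd = build_pycroppdf_command(
        input_pdf="/path/to/document.pdf",
        save_to="/path/to/directory/",
        save_as="document_cropped.pdf",
    )
    # Print a shell-safe rendering; pass `cmd` to subprocess.run to launch.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run` instead of a single string avoids quoting problems with paths that contain spaces.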
- Open a PDF: Drag and drop a PDF file into the window, or go to File > Open PDF....
- Select View Mode: Choose All, Odd/Even, or double-click a page thumbnail to view a single page.
- Apply a Crop: Select the Crop Box tool, click and drag a box on the canvas, and click Apply Crop. If odd/even mode is active, you can define separate boxes for odd and even pages.
- Apply a Whiteout: Select the Whiteout tool, click and drag to cover text/images.
- Manage Pages: Select thumbnails in the sidebar to delete unnecessary pages.
- Undo Changes: Use the Undo button or Ctrl + Z to revert your actions.
- Save: Click Save PDF to export the modified document and its provenance manifest.
Install pytest and execute it from the project root:
pytest
You can compile PyCropPDF into a standalone executable using pyinstaller. From your environment, run:
pyinstaller --onefile --noconsole --name PyCropPDF run.py
This produces a single, self-contained executable file inside the dist/ directory.
This project is licensed under the MIT License. See the LICENSE file for details.
