#resource #automation

Terminal PDF Manipulation

Splitting, merging, and compressing PDF files

Splitting PDFs

The gs (Ghostscript) command can be used to split PDF files.

The following command creates a new PDF file containing a subset of the pages in the original PDF file:

gs -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf -dFirstPage=1 -dLastPage=3 -sDEVICE=pdfwrite input.pdf

Note: Page numbers are 1-based and -dLastPage is inclusive. If the first and last pages match, a single page is output.
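
Because both bounds are inclusive and 1-based, it can help to check them before calling gs. A minimal sketch, assuming bash; extract_pages is a hypothetical helper name, and its argument order simply mirrors the pdfsplit script further down:

```shell
#!/bin/bash
# Hypothetical helper: extract_pages INPUT FIRST LAST OUTPUT
# Copies pages FIRST..LAST (1-based, inclusive) of INPUT into OUTPUT.
extract_pages() {
    local input="$1" first="$2" last="$3" output="$4"
    local n

    # Both bounds must be plain non-negative integers.
    for n in "$first" "$last"; do
        case "$n" in
            ''|*[!0-9]*) echo "not a page number: '$n'" >&2; return 2 ;;
        esac
    done

    # Pages start at 1 and the range must not be reversed.
    if [ "$first" -lt 1 ] || [ "$first" -gt "$last" ]; then
        echo "expected 1 <= first_page <= last_page" >&2
        return 2
    fi

    gs -dNOPAUSE -dQUIET -dBATCH -sDEVICE=pdfwrite \
        -sOutputFile="$output" -dFirstPage="$first" -dLastPage="$last" "$input"
}

# Usage (not run here):
#   extract_pages input.pdf 1 3 output.pdf
```

The function returns 2 without touching gs when the range is invalid, so a typo such as a reversed range fails before anything is written.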


If you want to split out every page of the PDF file, this can be done with a loop:

# Replace 10 with the number of pages in the PDF file
# Replace input.pdf with the path to the source PDF file
for i in $(seq 1 10)
do
    printf -v outputFile "%05d_%s" $i output.pdf
    echo "page $i - $outputFile"
    gs -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$outputFile" -dFirstPage=$i -dLastPage=$i -sDEVICE=pdfwrite "input.pdf"
done
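
The loop above needs the page count filled in by hand. A minimal sketch of reading it automatically, assuming the separate pdfinfo tool (from poppler-utils, not part of Ghostscript) is installed; parse_page_count and pdf_page_count are hypothetical helper names:

```shell
#!/bin/bash
# Hypothetical helper: read pdfinfo-style output on stdin and print
# the number from the "Pages:" line.
parse_page_count() {
    awk '$1 == "Pages:" { print $2; exit }'
}

# Hypothetical wrapper: print the page count of the given PDF.
# Assumes pdfinfo (poppler-utils) is available on the PATH.
pdf_page_count() {
    pdfinfo "$1" | parse_page_count
}

# Usage (not run here):
#   pages=$(pdf_page_count input.pdf)
#   for i in $(seq 1 "$pages"); do ...; done
```

Keeping the parsing separate from the pdfinfo call makes it easy to check the parsing against sample output without needing a real PDF.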

Or, for a more complete script:

#!/bin/bash
#
# pdfsplit [input.pdf] [first_page] [last_page] [output suffix]
#
# Example: pdfsplit big_file.pdf 10 20 page.pdf
#
# This script splits the given page range of a PDF file into single-page
# files. Each page is saved to a separate file named after [output suffix],
# with the zero-padded page number prefixed to it (e.g. 00015_page.pdf).
#
# Inspired by: https://stackoverflow.com/a/10509904

if [ $# -lt 4 ]
then
    echo "Usage: pdfsplit input.pdf first_page last_page output.pdf"
    exit 1
fi

for i in $(seq "$2" "$3")
do
    printf -v outputFile "%05d_%s" "$i" "$4"
    echo "page $i - $outputFile"
    gs -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$outputFile" -dFirstPage="$i" -dLastPage="$i" -sDEVICE=pdfwrite "$1"
done
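
One detail of the naming step is worth a sketch: in bash, printf's %d reads a number with a leading zero as octal, so a page number that arrives as 010 would be named 00008, and 08 would be rejected. The hypothetical helper below (page_file_name is not part of the script above) forces base 10 before formatting:

```shell
#!/bin/bash
# Hypothetical helper: print the zero-padded output name for one page.
page_file_name() {
    local page="$1" suffix="$2"
    # 10# forces base 10, so "08" is read as 8 instead of an invalid octal.
    printf '%05d_%s\n' "$((10#$page))" "$suffix"
}

# Usage:
#   outputFile=$(page_file_name 15 page.pdf)
```

The five-digit padding also means the generated names sort in page order, which matters when the pages are merged again later.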

Merging PDFs

The gs command can also be used to merge multiple PDF files into one.

The following command will merge all *_part.pdf files into a single merged.pdf file.

gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=merged.pdf -dBATCH *_part.pdf

Note: Pages appear in the final PDF in the order the files are passed to the command. With a glob such as *_part.pdf, that is the shell's sorted expansion order.
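
The shell expands *_part.pdf in lexical order, so a file named 10_part.pdf would come before 2_part.pdf. Zero-padded names like the ones produced by the split script avoid this. For unpadded names, a minimal sketch (assuming bash and file names that start with a number and contain no newlines; numeric_order and merge_parts are hypothetical helpers) is to sort numerically before merging:

```shell
#!/bin/bash
# Hypothetical helper: print the given file names, one per line,
# sorted by their leading number.
numeric_order() {
    printf '%s\n' "$@" | sort -n
}

# Hypothetical wrapper: merge all *_part.pdf files in the current
# directory into merged.pdf, in numeric order.
merge_parts() {
    local parts=() f
    while IFS= read -r f; do
        parts+=("$f")
    done < <(numeric_order *_part.pdf)
    gs -dNOPAUSE -dQUIET -dBATCH -sDEVICE=pdfwrite \
        -sOutputFile=merged.pdf "${parts[@]}"
}

# Usage (not run here):
#   merge_parts
```

Collecting the names in an array keeps file names with spaces intact when they are handed to gs.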

Compress PDF file

The gs command also has support for compressing a PDF file.

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

Note: Replace the input.pdf and output.pdf paths as desired.

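Three commonly used compression levels (values of -dPDFSETTINGS) are listed below. Ghostscript also accepts /printer and /default.

Level      Description
/prepress  Higher quality output - 300 dpi (used in the command above)
/ebook     Medium quality - 150 dpi
/screen    Low quality - 72 dpi (may be blurry)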

This compression does not always work and may actually produce a larger output file. Another option is to split the PDF first, optionally compress each page, then merge the pages again. Sometimes just splitting and re-merging results in a smaller file 🤷️
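
Since the output can end up larger than the input, it is worth comparing sizes before keeping the result. A minimal sketch, assuming bash; file_size and pick_smaller are hypothetical helper names:

```shell
#!/bin/bash
# Hypothetical helper: print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: print the path of the smaller of two files.
# On a tie the first file (the original) is printed.
pick_smaller() {
    local original="$1" candidate="$2"
    if [ "$(file_size "$candidate")" -lt "$(file_size "$original")" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s\n' "$original"
    fi
}

# Usage (not run here):
#   gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH \
#       -sOutputFile=compressed.pdf input.pdf
#   best=$(pick_smaller input.pdf compressed.pdf)
```

The helper only reports which file is smaller and never deletes anything, so the original stays available if the compressed copy turns out to be unusable.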

Convert Image to PDF

I find it's quite common to want to convert an image (e.g. a photo) into a PDF file. The gs command can again help here.

To convert a single image:

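gs -dNOSAFER -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf \
    viewjpeg.ps -c \(my-file.jpg\) viewJPEG showpage

Just replace my-file.jpg with the name of the image file. viewjpeg.ps is the library file shipped with Ghostscript that defines the viewJPEG procedure, so it has to be loaded before the -c flag.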

Note: This process isn't perfect and can lead to PDF files with pages that do not fit their content.

It is also possible to convert multiple files into a PDF by adding more values to the -c flag.

gs -dNOSAFER -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf \
    viewjpeg.ps -c \(my-file-1.jpg\) viewJPEG showpage \
    \(my-file-2.jpg\) viewJPEG showpage
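
For more than a couple of images, typing each group by hand gets tedious. A minimal sketch of building the arguments in a loop, assuming bash and file names without parentheses or backslashes (those would need escaping inside a PostScript string); build_viewjpeg_args and the gs_args array are hypothetical names:

```shell
#!/bin/bash
# Hypothetical helper: fill the global array gs_args with one
# "(file) viewJPEG showpage" group per image, in the order given.
build_viewjpeg_args() {
    gs_args=()
    local img
    for img in "$@"; do
        gs_args+=("($img)" viewJPEG showpage)
    done
}

# Usage (not run here; assumes viewjpeg.ps ships with your Ghostscript):
#   build_viewjpeg_args scan-*.jpg
#   gs -dNOSAFER -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
#       -sOutputFile=output.pdf viewjpeg.ps -c "${gs_args[@]}"
```

Each image becomes one page of output.pdf, in the order the file names were passed in.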
