Charlie Harvey

Split multi-page PDFs into single page PDFs on GNU/Linux with pdftk

Is there a nice way to split a multi-page PDF into its constituent pages? It's a question that comes up more often than you would think. Of course you could point some proprietary software at it, or you could do the job by hand. But there is a lovely free software way to do it, so you would be sort of daft to even consider those options.

pdftk calls itself The PDF Toolkit and goes on to say:

If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents.

Nice. If you don’t have it installed, then follow the install guidelines for your distro or OS. It will even work on Windows, if you are that way inclined. On Debian I can type
$ aptitude install pdftk
In reality it was already installed on both boxes where I tested this, so it may very well be a part of the core Squeeze install.

Once you have it installed, you get all the new powers that pdftk enables. Let’s see.

  • Merge PDF Documents
  • Split PDF Pages into a New Document
  • Rotate PDF Pages or Documents
  • Decrypt Input as Necessary (Password Required)
  • Encrypt Output as Desired
  • Fill PDF Forms with FDF Data or XFDF Data and/or Flatten Forms
  • Apply a Background Watermark or a Foreground Stamp
  • Report on PDF Metrics such as Metadata, Bookmarks, and Page Labels
  • Update PDF Metadata
  • Attach Files to PDF Pages or the PDF Document
  • Unpack PDF Attachments
  • Burst a PDF Document into Single Pages
  • Uncompress and Re-Compress Page Streams
  • Repair Corrupted PDF (Where Possible)

Not too shabby. Now splitting your PDF into its constituent pages is a simple matter of typing the following at the prompt:
$ pdftk my_multi_page.pdf burst
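
By default burst names the pages pg_0001.pdf, pg_0002.pdf and so on, and it also writes a doc_data.txt file with the document's metadata. If you would rather keep the pages in their own folder, burst accepts a printf-style output pattern. Here is a minimal sketch of a wrapper for that. The function name burst_pdf and the PDFTK override variable are my own inventions for illustration, not part of pdftk.

```shell
#!/bin/sh
# Hypothetical helper: burst INPUT into OUTDIR, one PDF per page.
# Set PDFTK to another command (for example "echo") to preview
# the command line instead of running pdftk.
burst_pdf() {
    input=$1
    outdir=${2:-pages}   # default folder name is an assumption

    mkdir -p "$outdir" || return 1

    # pdftk fills in %04d with the page number:
    # pg_0001.pdf, pg_0002.pdf, ...
    "${PDFTK:-pdftk}" "$input" burst output "$outdir/pg_%04d.pdf"
}
```

After sourcing that, burst_pdf my_multi_page.pdf pages should leave one file per page in ./pages.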

It is really worth checking out the pdftk manpage or this Linux.com tutorial. But here is a quick list of the top five other things that I often find myself needing to do with PDFs.

5 top ways to manipulate PDFs with pdftk

Pull some pages out of a PDF but keep the rest
$ pdftk my.pdf cat 1-5 13-21 24-end output my_edited.pdf
Concatenate (that’s join!) multiple pdfs into one big pdf
$ pdftk part0.pdf part1.pdf cat output the_whole_book.pdf
Get all of the attached files out of a PDF and put them in a folder (this unpacks attachments only, not images embedded in the pages)
$ pdftk my.pdf unpack_files output ~/attachments_from_my_pdf/
Print stats and metadata about a PDF
$ pdftk my.pdf dump_data
Add a background watermark (or a foreground overlay) to a pdf
  • For a background:
    $ pdftk my.pdf background bg.pdf output my_watermarked.pdf
  • For an overlay:
    $ pdftk my.pdf stamp overlay.pdf output my_overlayed.pdf
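
A couple of these combine nicely. As a rough sketch, the NumberOfPages line from dump_data can be used to build the page ranges for cat, so that you can drop a single page without working out the ranges by hand. The names pdf_page_count, drop_page and the PDFTK override variable are hypothetical, made up for this example, and the sketch assumes the page number you give is between 1 and the last page of a PDF with at least two pages.

```shell
#!/bin/sh
# Hypothetical helper: print the number of pages in a PDF by
# reading the "NumberOfPages: N" line from pdftk's dump_data report.
pdf_page_count() {
    "${PDFTK:-pdftk}" "$1" dump_data | awk '/^NumberOfPages:/ { print $2 }'
}

# Hypothetical helper: copy INPUT to OUTPUT, leaving out page PAGE.
# Assumes 1 <= PAGE <= total pages and that INPUT has 2+ pages.
drop_page() {
    input=$1
    page=$2
    output=$3

    total=$(pdf_page_count "$input")
    [ -n "$total" ] || return 1

    ranges=""
    # Pages before the dropped one, if there are any.
    if [ "$page" -gt 1 ]; then
        ranges="1-$((page - 1))"
    fi
    # Pages after the dropped one, if there are any.
    if [ "$page" -lt "$total" ]; then
        ranges="$ranges $((page + 1))-end"
    fi

    # $ranges is deliberately unquoted so that each range
    # reaches pdftk as a separate argument.
    "${PDFTK:-pdftk}" "$input" cat $ranges output "$output"
}
```

So drop_page my.pdf 4 my_edited.pdf would run pdftk my.pdf cat 1-3 5-end output my_edited.pdf.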

