Is there a nice way to split a multi-page PDF into its constituent pages? It’s a question that comes up more often than you would think. Of course you could point some proprietary software at it, or you could do the job by hand. But there is a lovely free software way to do it, so you would be sort of daft to even consider those options.
pdftk calls itself The PDF Toolkit and goes on to say:
If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents.
Nice. If you don’t have it installed, then follow the install guidelines for your distro or OS. It will even work on Windows, if you are that way inclined. On Debian I can type
$ aptitude install pdftk
In reality it was already installed on both boxes where I tested this, so it may very well be a part of the core Squeeze install.
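If you are not sure whether it is already on your box, you can ask the shell before installing anything. This is only a small sketch: have_tool is a helper name I made up, and the hint it prints is my own wording, not pdftk output.

```shell
# Succeed if the named command (pdftk by default) can be found by the shell.
# have_tool is a made-up helper name, not part of pdftk.
have_tool() {
    command -v "${1:-pdftk}" >/dev/null 2>&1
}

if have_tool pdftk; then
    # Print the installed version.
    pdftk --version
else
    echo "pdftk not found; install it with your package manager" >&2
fi
```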
Once you have it installed, you get all the new powers that pdftk enables. Let’s see.
- Merge PDF Documents
- Split PDF Pages into a New Document
- Rotate PDF Pages or Documents
- Decrypt Input as Necessary (Password Required)
- Encrypt Output as Desired
- Fill PDF Forms with FDF Data or XFDF Data and/or Flatten Forms
- Apply a Background Watermark or a Foreground Stamp
- Report on PDF Metrics such as Metadata, Bookmarks, and Page Labels
- Update PDF Metadata
- Attach Files to PDF Pages or the PDF Document
- Unpack PDF Attachments
- Burst a PDF Document into Single Pages
- Uncompress and Re-Compress Page Streams
- Repair Corrupted PDF (Where Possible)
Not too shabby. Now splitting your PDF into its constituent pages is a simple matter of typing the following at the prompt
$ pdftk my_multi_page.pdf burst
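Left like that, burst names the pages pg_0001.pdf, pg_0002.pdf and so on in the current directory, and it also writes a report called doc_data.txt. If you would rather have the pages land in their own folder, burst takes an output pattern with a printf-style page number. Here is a rough sketch of a wrapper, where burst_pdf and the page_ prefix are just names I picked for illustration.

```shell
# Split a PDF into one file per page inside its own directory.
# burst_pdf is a made-up wrapper name; the page_ prefix is arbitrary.
# Usage: burst_pdf INPUT.pdf OUTPUT_DIR
burst_pdf() {
    input=$1
    outdir=$2
    # Make sure the target directory exists before pdftk writes into it.
    mkdir -p "$outdir" || return 1
    # %04d is replaced by the zero-padded page number: page_0001.pdf, ...
    pdftk "$input" burst output "$outdir/page_%04d.pdf"
}
```

After pasting the function into your shell, you would call it like this
$ burst_pdf my_multi_page.pdf pages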
It is really worth checking out the pdftk manpage or this Linux.com tutorial. But here is a quick list of the top five other things that I often find myself needing to do with PDFs.
5 top ways to manipulate PDFs with pdftk
- Drop some pages from a PDF but keep the rest (here pages 6-12 and 22-23 are left out)
$ pdftk my.pdf cat 1-5 13-21 24-end output my_edited.pdf
- Concatenate (that’s join!) multiple PDFs into one big PDF
$ pdftk part0.pdf part1.pdf cat output the_whole_book.pdf
- Get all of the file attachments out of a PDF and put them in a folder (this unpacks attached files, not the images embedded in the pages)
$ pdftk my.pdf unpack_files output ~/attachments_from_my_pdf/
- Print stats and metadata about a PDF
$ pdftk my.pdf dump_data
- Add a background watermark (or a foreground overlay) to a PDF
  - For a background
$ pdftk my.pdf background bg.pdf output my_watermarked.pdf
  - For an overlay
$ pdftk my.pdf stamp overlay.pdf output my_overlayed.pdf
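A couple of these combine nicely. The cat ranges in the first item list the pages you keep, so dropping a span means working out what sits on either side of it, and dump_data can tell you where the document ends. Below is a rough sketch under a few assumptions: pdf_page_count and drop_pages are names I invented, the page numbers are taken to satisfy 1 <= FIRST <= LAST <= page count, and the shell is sh or bash (zsh does not split the unquoted variable the same way).

```shell
# Print the page count that pdftk reports in its dump_data output.
# pdf_page_count is a made-up helper name.
pdf_page_count() {
    pdftk "$1" dump_data | awk '/^NumberOfPages:/ { print $2 }'
}

# Write OUTPUT as a copy of INPUT without pages FIRST..LAST (inclusive).
# drop_pages is a made-up helper; it assumes 1 <= FIRST <= LAST <= page count.
# Usage: drop_pages INPUT.pdf FIRST LAST OUTPUT.pdf
drop_pages() {
    input=$1
    first=$2
    last=$3
    output=$4
    total=$(pdf_page_count "$input")
    if [ -z "$total" ]; then
        echo "could not read the page count of $input" >&2
        return 1
    fi
    ranges=""
    # Keep the pages in front of the dropped span, if there are any.
    if [ "$first" -gt 1 ]; then
        ranges="1-$((first - 1))"
    fi
    # Keep the pages behind the dropped span, if there are any.
    if [ "$last" -lt "$total" ]; then
        ranges="$ranges $((last + 1))-end"
    fi
    if [ -z "$ranges" ]; then
        echo "that would drop every page" >&2
        return 1
    fi
    # $ranges is unquoted on purpose so that each range becomes its own argument.
    pdftk "$input" cat $ranges output "$output"
}
```

With the functions pasted into your shell, leaving out pages 6 to 12 would look like this
$ drop_pages my.pdf 6 12 my_edited.pdf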