Splitting PDFs vertically to turn double-page into single-page
I’ve spent a long time looking for a script to turn double-page scanned PDFs into single-page PDFs, specifically so I can read them on my iPod Touch using the excellent GoodReader.
I sat down to write a question about this on Super User, but in the process of doing so I managed to formulate the perfect Google query - ‘split double page pdf’ (no quotes). I downloaded this Perl script, installed the PDF::API2 CPAN module, and it just worked.
Before finding this, I investigated a whole load of commercial apps, including Acrobat and a bunch of $49 Windows apps promoted with stock photos of efficient-looking business guys in suits. Because that whole area has a bad whiff of conman tactics about it, I started looking into writing my own tool with iText, the Java PDF manipulation library. A thirty-line Perl script does the job much better though.
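The script itself isn’t reproduced here, but the geometry behind this kind of split is simple enough to sketch. The Python below is only an illustration, not the Perl script’s actual code: `split_box` is a made-up helper, and it assumes a page box given as (x0, y0, x1, y1) in PDF points, with the two scanned pages sitting side by side.

```python
def split_box(box):
    """Split a page box (x0, y0, x1, y1) into its left and right halves.

    Hypothetical helper for illustration: coordinates are PDF points,
    and the scan is assumed to hold two pages side by side.
    """
    x0, y0, x1, y1 = box
    mid = (x0 + x1) / 2  # vertical line down the middle of the spread
    left = (x0, y0, mid, y1)
    right = (mid, y0, x1, y1)
    return left, right


# Example spread: 1200 x 800 points, origin at the bottom-left corner.
left, right = split_box((0, 0, 1200, 800))
print(left)
print(right)
```

A real tool would read the box from each page of the PDF and write every source page out twice, once cropped to each half.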
There is only one slight downside I’ve found - it does rather inflate the file sizes. I had a 17 MB PDF file, which I then split using the Perl script. It’s now 36 MB. It doesn’t bother me too much, but my guess is that the cropped half of each scan is only being hidden rather than actually removed from the file, so the same data ends up duplicated on the following page.
I also tend to have to open the finished file in Preview.app and remove the odd page or two. I am tempted to modify the Perl script to support some ‘ignore’ pages. Basically, it would need to do a quick statistical run through the pages beforehand and check their sizes, so that the odd few pages at the start which are funnily sized (title pages etc.) don’t get treated the same way as the rest of the document.
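The ‘ignore’ pages idea could look something like the sketch below. This is Python rather than a patch to the Perl script, and everything in it is an assumption for illustration: `typical_pages` is a hypothetical helper, page sizes are taken as (width, height) pairs in points that some PDF library has already extracted, and the 5% tolerance is an arbitrary starting value.

```python
from statistics import median


def typical_pages(sizes, tolerance=0.05):
    """Return the indices of pages whose size is close to the median size.

    sizes: list of (width, height) pairs, one per page, in points.
    tolerance: allowed relative deviation from the median (assumed 5%).
    Pages outside the tolerance (title pages etc.) are left out, so a
    splitter could copy them through untouched instead of halving them.
    """
    med_w = median(w for w, _ in sizes)
    med_h = median(h for _, h in sizes)
    return [
        i
        for i, (w, h) in enumerate(sizes)
        if abs(w - med_w) <= tolerance * med_w
        and abs(h - med_h) <= tolerance * med_h
    ]


# One single-width title page followed by four double-page scans.
sizes = [(595, 842), (1190, 842), (1191, 842), (1190, 841), (1189, 842)]
print(typical_pages(sizes))  # [1, 2, 3, 4]
```

Using the median rather than the mean keeps a handful of odd pages from skewing what counts as the normal size, though it does assume the double-page scans make up most of the document.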
Anyway, I can now read various scanned papers on my iPod, so that’s all good. I hope others in similar predicaments can use this to make their PDFs readable on small devices like iPods.