Monday, April 5, 2010

splitting a PDF file into one-page chunks

Yesterday I found out how to split and merge PDF files using the iText Java library.

Today I will post the split program, which I simplified and cleaned up from an example.

package org.yi.happy.pdf;

import java.io.FileOutputStream;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;

public class PdfSplitMain {
    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: infile [outbase]");
            return;
        }

        String inFile = args[0];

        String outBase;
        if (args.length < 2) {
            if (inFile.endsWith(".pdf")) {
                outBase = inFile.substring(0, inFile.length() - 4) + "-";
            } else {
                outBase = inFile;
            }
        } else {
            outBase = args[1];
        }

        // Open the input once; each page is copied into its own output file.
        PdfReader reader = new PdfReader(inFile);
        try {
            // iText page numbers start at 1.
            for (int i = 1; i <= reader.getNumberOfPages(); i++) {
                String outFile = outBase + String.format("%04d", i) + ".pdf";

                Document document = new Document();
                FileOutputStream output = new FileOutputStream(outFile);
                try {
                    PdfCopy writer = new PdfCopy(document, output);
                    document.open();
                    PdfImportedPage page = writer.getImportedPage(reader, i);
                    writer.addPage(page);
                    document.close();
                    writer.close();
                } finally {
                    output.close();
                }
            }
        } finally {
            reader.close();
        }
    }
}

Basically the input file is opened, and each page from it is copied to a separate output file, named with the output base followed by a zero-padded four-digit page number.
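
As a rough sketch, the file naming rule can be pulled out into a small helper and tried on its own, without iText. The class and method names here (PageNames, defaultBase, pageFile) are made up for illustration; they are not part of the program above or of iText, and the logic simply mirrors what the program does inline.

```java
// Hypothetical helper for illustration only; mirrors the naming logic of PdfSplitMain.
final class PageNames {
    private PageNames() {
    }

    /** Default output base: strip a trailing ".pdf" and add "-", otherwise use the name as is. */
    static String defaultBase(String inFile) {
        if (inFile.endsWith(".pdf")) {
            return inFile.substring(0, inFile.length() - 4) + "-";
        }
        return inFile;
    }

    /** Output file name for a 1-based page number, zero-padded to at least four digits. */
    static String pageFile(String outBase, int pageNumber) {
        return outBase + String.format("%04d", pageNumber) + ".pdf";
    }

    public static void main(String[] args) {
        String base = defaultBase("report.pdf");
        for (int i = 1; i <= 3; i++) {
            // Prints report-0001.pdf, report-0002.pdf, report-0003.pdf
            System.out.println(pageFile(base, i));
        }
    }
}
```

Padding the page number keeps the output files in page order when they are sorted by name, at least up to 9999 pages.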

After figuring out these programs, with the intention of being able to remix PDF files along page boundaries, I learned that the Preview application on the Mac that I am using can edit PDF files very easily, so I will probably be using that instead most of the time.
