Adobe’s PDF has become a standard for text documents. Most office products can export their content into PDF. However, this software reaches its limits if you want advanced tasks such as combining different PDF documents into one single document or adding and adjusting the bookmarks panel for better navigation. Imagine that you want to collect all relevant Perl.com articles in one PDF file with an up-to-date bookmarks panel. You could use a tool like HTMLDOC, but adding article number 51 would require you to fetch articles one through 50 from the Web again. In most cases you would not be satisfied by the resulting bookmarks panel, either. This article shows how to use PDF::Reuse, by Lars Lundberg, for combining different PDF documents and adding bookmarks to them.
Create or modify PDF with a Perl script. Many web sites use Perl for creating dynamic content. You can also use Perl to script Acrobat on your local machine Hack #95. Given the great number of packages that extend Perl, it is no surprise that packages exist for creating and manipulating PDF. Let's take a look. Essential Perl Page: 1 Essential Perl This document is a quick introduction to the Perl language. Perl has many features, but you can get pretty far with just the basics, and that's what this document is about. Create PDF files. Ppm install PDF-Create. + How to install PDF-Create. Download and install ActivePerl. Open Command Prompt. Type ppm install PDF-Create.
You can take advantage of what has been done before, it is not necessary to start from scratch every time you create a PDF-file. You use old PDF-files as a source for forms, images, fonts and texts. The components are taken as they are, or rearranged, and you add your own texts and you produce new output. Perl is widely known as 'the duct-tape of the Internet'. Perl can handle encrypted Web data, including e-commerce transactions. Perl can be embedded into web servers to speed up processing by as much as 2000%. Perl's modperl allows the Apache web server to embed a Perl interpreter.
Although its capabilities are limited in this area, you can also use
PDF::Reuse to create PDF documents. If you want to create more sophisticated documents you should investigate other PDF-packages like PDF::API2 from Alfred Reibenschuh or Text::PDF from Martin Hosken. However
PDF::Reuse is sufficient to create a simple PDF to use in later examples. The following listing should be rather self explanatory.
Open a new file with with
prFile($filename) and close it with
prEnd. Between those two calls, add text with the
prText command. You can also draw graphics by using the low level command
prAdd with plain PDF markup as parameter. Start a new page with
prFile starts the first one automatically, so you need to add a new page only if your document has more than one page. Be aware that for the
prText(x,y,Text) command the origin of the coordinate system is on the left bottom of the page.
As an example of adding PDF markup with
prAdd, the code creates a red rectangle with blue borders. In case you would like to add more graphics or complex graphics to your PDF, you can study the examples of the PDF reference manual. If so, consider switching to
Text::PDF instead of using
prAdd, as they both provide a comfortable layer of abstraction over the PDF markup language.
Combining PDF Documents
PDF::Reuse’s main strength is the modification and reassembling of existing PDF documents. The next example assembles a new file from the example material.
prFile($filename) opens the file. Next, the
prDoc($filename, $firstPage, $lastPage) calls add to the new file various page ranges from the example file. The arguments
$lastPage are optional. Omit both to add the entire document. If only
$firstPage is present, the call will add everything from that page to the end. Finally,
prEnd closes the file.
Reusing Existing PDF Files
PDF::Reuse it is possible to use existing PDF files as templates for creating new documents. Suppose that you have a file customer.txt containing a list of customers to whom to send a letter. You’ve used a tool to create a PDF document, such as OpenOffice.org or Adobe Acrobat, to produce the letter itself. Now you can write a short program to add the date and the names and addresses of your customers to the letter.
After opening the file with
prFile, the call to
prCompress(1) enables PDF compression.
prFont sets the file’s font. The always-available options are Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic, Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique, Helvetica, Helvetica-Bold, Helvetica-Oblique, and Helvetica-BoldOblique. Set the font size with
prFontSize. The default font is Helvetica, with 12 pixel size.
The rest of the code is a simple loop over the file containing the customer data to filling the template with
Adding Page Numbers
Sometimes you need only make a small change to a document, such as adding missing page numbers.
prSinglePage takes one page after the other from an existing PDFdocument and returns the number of remaining pages after each invocation.
Low-Level PDF Commands
If you know low-level PDF instructions, you can add them with with the
PDF::Reuse will perform no syntax checks on the instructions, so refer to the PDF reference manual. Here’s an example of printing colored rectangles with the
Working with PDF files becomes comfortable if the document has bookmarks with a table of contents-like structure. Some applications either can’t provide the PDF document with bookmarks or support insufficient or incorrect bookmarks.
PDF::Reuse can fill this gap with the
A bookmark reference is a hash or a array of hashes that looks like:
PDF::Reuse that fixes this issue.
Other examples for using
PDF::Reuse, including image embedding, are available in the PDF::Reuse::Tutorial.
A Console Application for Combining PDF Documents
To avoid editing the Perl code for combining PDF documents every time you want to merge documents, I’ve written a console application that takes the names of the input files and the page ranges for each file as arguments. That’s easy to reuse in a graphical application using Perl/Tk, so I’ve put that code in a separate Perl module called
CombinePDFs. The command-line application will interact with this package instead of directly working on
PDF::Reuse. The following diagram shows the relationship between the Packages, example, and applications.
The application app-combine-console-pdfs.pl does not deal directly with
PDF::Reuse but parses the command line arguments with Getopt::Long written by Johan Vromans. This is the standard package for this task. Here it parses the input filenames and the page ranges into two arrays of same length. The user also has to supply a filename for the output and, optionally, a bookmarks file. The main subroutine that parses the command line arguments and executes
If the user passes an insufficient number of arguments, invalid filenames, or incorrect page ranges, the code invokes the the usage subroutine. It also gets invoked if the user asks explicitly for
-help on the command line. Any good command line application should be written that way.
Getopt::Long can distinguish between mandatory arguments, with
= as the symbol after the argument name (infile, pages), optional arguments, with
: (bookmarks), or flags (overwrite, usage), without a symbol. It can store these arguments as arrays (infile, pages), hashes, or scalars. It also supports type checking.
The application itself mainly performs error checking. If everything is fine, it calls the
CombinePDFs::createPDF subroutine, passing the array of input files, the array of page ranges, and the bookmarks information. The bookmarks scalar is optional.
Page ranges can be comma-separated ranges (
1-11,14,17-23), single pages, or the
all token. You can include the same page several times in the same document.
The file-checking code looks for read permissions and tests if the file is a PDF document by using the
CombinePDFs::isPDF($filename) subroutine. Although PDF, by Antonio Rosella, also provides such a method, this package was not developed with the
use strict pragma and gives a lot of warnings. Furthermore, the package is not actively maintained, so there seems to be no chance to fix this in the near future. Implementing the
isPDF subroutine is quite simple; it reads the first line of the PDF file and checks for the magic string
%PDF-1.[0-9] in the first line of the document.
Please note that
PDF::Reuse is not an object oriented package. Therefore the
CombinePDFs package is not object oriented, either. A user of this package could create several instances, but all instances work on the same PDF file.
Submitting complex data structures via the command line is a difficult issue, so I decided that bookmarks should come from a text file. This file has a simple markup to reflect a tree structure, where each line resembles:
The level starts with 0 for root bookmarks. Children of the root bookmarks have a level of 1, their children a level of 2, and so on. Currently, the system supports bookmarks up to three levels of nesting:
The parsing subroutine for the bookmarks file
CombinePDFs::addBookmarks($filename) should be easy to understand, though that’s not necessarily true of the complex data structure created inside this subroutine.
Perl Script Create Pdf
Bookmarks are an array of hashes.
addBookmarks uses several attributes.
text One nation under gold pdf free download free. is the title of the entry in the bookmarks panel.
act is the action to trigger when someone clicks the entry. Here it is the page number to open.
kids contains a reference to the children of this bookmark entry. During the loop over the file content, the code searches for each level the last entry in a variable and pushes its related children on those last entries. The root bookmarks get collected as an array, and the loop adds the children as a reference to an array, and so on for the grand children. The result is a nested complex data structure which stores all children in the
kids attribute of the parent’s bookmarks hash—an array of hashes containing other arrays of hashes and so on.
The parsing subroutine for the bookmarks file
CombinePDFs::addBookmarks($filename) collects bookmarks in a array of hashes. At the end, it adds the bookmarks to the document with
prBookmarks($reference). All of this means that you can use a bookmarks file with the PDF file with a command line like:
Currently, you must open the document’s navigation panel manually because
PDF::Reuse does not yet allow you to declare a default view, whether full screen or panel view. This is easy to fix, and the author Lars Lundberg has promised me to do so in a next release of
PDF::Reuse. In order to enable this feature until a new release will appear I included a modified version of
PDF::Reuse in the examples zip file that accompanies this article.
act key with a
page key using the appropiate page number and scroll options:
Then print the bookmarks to the PDF document as usual with
Tk Application to Combine PDF Documents
Console applications are fine for experienced users, but you can’t expect that all users belong to this category. Therefore it might be worth it to write a GUI for combining PDF documents. The Perl/Tk toolkit founded on the old Tix widgets for Tcl/Tk is not very modern, although this might change with the Tcl/Tk release 8.5 and the Tile widgets—but it is very portable. That’s why I used it for the GUI example. Because I put a layer between the
PDF::Reuse package and the command line application with the
CombinePDFs package, it was easy to reuse those parts in the Tk-application app-combine-tk-pdfs.pl.
With the Tk application, the user visually selects PDF files, orders the files in a
Tk::Tree widget, and changes the page ranges and the bookmarks text in
Tk::Entry fields. Furthermore, the application can store the resulting tree structure inside a session file and restored that later on. It’s also possible to copy and paste entries inside the tree, which makes it easy to create a bookmarks panel for single files without using bookmark files. The Tk application can be found in the download at the end of this article.
Beside the final PDF file, the application creates a file with the same basename and the .cnt extension. This file contains the bookmarks for the PDF. It’s also useful to continue the processing of the combined PDF file instead of reassembling all the source files again. The entry for this feature is
When loading a bookmarks file, the same extension convention is in place.
Other PDF Packages on CPAN
PDF::Reuse, but there are several other options for PDF creation and manipulation on the CPAN.
- PDF::API2, by Alfred Reibenschuh, is actively maintained. It is the package of choice if creating new PDF documents from scratch.
- PDF::API2::Simple, by Red Tree Systems, is a wrapper over the
PDF::API2module for users who find the PDF::API2 module to difficult to use.
- Text::PDF, by Martin Hosken, can work on more than PDF file at the same time and has Truetype font support.
- CAM::PDF, by Clotho Advanced Media, is like
PDF::Reusemore focused on reading and manipulating existing PDF documents. However, it can work on multiple files at the same time. Use it if you need more features than
PDF::Reuse is a well-written and well-documented package, which makes it easy to create, combine, and change existing PDF documents. The two sample applications show some of its capabilities. Two limitations should be mentioned however,
PDF::Reuse can’t reuse existing bookmarks, and after combining different PDF documents some of the inner document hyperlinks might stop working properly. The example source code for the applications, packages, and the modified
PDF::Reuse is available.
Perl Create Pdf
- What is mod_perl?
- Success Stories
- mod_perl is the power behind many of the Internet's busiest and mostadvanced web sites. Listed here are success stories from people usingmod_perl; also, world-wide statistics of mod_perl usage
- Get source and binary mod_perl distributions and download the documentation
- The mod_perl project features a lot of documentation, both formod_perl 1.0 and 2.0. If there is anything you need to learn aboutmod_perl, you'll learn it here.
- Reporting Bugs
- Before a bug can be solved, developers need to be able to reproduceit. Users need to provide all the relevant information that may assistin reproducing the bug. However it's hard to know what informationneeds to be supplied in the bug report. In order to speed up theinformation retrieval process, we wrote the guidelines explainingexactly what information is expected. Usually, the better the bugreport is the sooner it's going to be reproduced and therefore fixed.
- Getting Help
- Solve your mod_perl problems: with the help of the mod_perl mailinglists, a mod_perl training company or a commercial supportcompany. Find an ISP providing mod_perl services.
- Mailing Lists
- mod_perl and related projects' mailing lists.
- There is a lot of software out there ready to run with mod_perl and/orhelp you with your programming project.
- How to contribute to the mod_perl community
- Got mod_perl?
- Advocacy documents and resources for mod_perl
- About mod_perl
- General information regarding mod_perl of historical interest.
- mod_perl subprojects
- Other projects maintained under the mod_perl umbrella
- Find the mod_perl job of your dreams!
- Site Map
- You can reach any document on this site from this sitemap.