Markup
Mark up a PDF with hyperlinks, a bookmark outline, page numbers, and so on.
- Scan for URLs and make into hyperlinks.
- Scan for page references ("see page 5") and make into hyperlinks.
- Scan for section headers and make into a hierarchical outline/bookmarks.
- Add page numbers.
Markup can be specified by a file or with patterns.
For a PDF that is long and has a complicated structure, try the -analyze option
to automatically collect most of the data into a file, then manually tune it, and finally
use -apply to fuse the data into the PDF and make it active.
Options
java tool.pdf.Markup [options] PDF-file(s)
- Scanning
- -url --
Scan for URLs and make into hyperlinks.
- -pageref --
Scan for the table of contents and for page and section references ("page 5", "see Appendix A") and make them into local hyperlinks.
Skip over troublesome parts, like the table of contents.
- -analyze
or -nowrite --
scan only, for the types selected by the other options above,
and report the discoveries for use in a database,
or to be tuned and then applied with -apply.
- Additional markup (no scanning required)
- -pagenum start -- add page numbers, starting with number start.
Can be used to apply continuous page numbering across a set of PDFs or a merged PDF.
- Writing
- -apply file --
take a file in the format produced by -analyze,
perhaps manually tuned or generated by another program,
and apply the markup operations.
If -apply is in effect, no scanning is done.
- -overwrite --
overwrite an existing outline / bookmark set
- -page range --
only scan/apply in the given page range.
This allows different rules for different parts of a document. For example,
in an index, perhaps every number should be treated as a page reference,
whereas in the body of the text, numbers should be considered pages
only if part of a pattern like "see page n".
- -verbose --
display text that matches a pattern as it is encountered
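The distinction drawn under -page, between an index where every number is a page reference and body text where only phrases like "see page n" count, can be illustrated with two regular expressions. This is a minimal sketch, not the tool's actual implementation: the class PageRefScanner, the method findRefs, and both patterns are hypothetical names and choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of tool.pdf.Markup.
class PageRefScanner {
    // Body text: a number counts only when it follows the word "page".
    private static final Pattern BODY =
        Pattern.compile("\\bpage\\s+(\\d+)\\b", Pattern.CASE_INSENSITIVE);
    // Index: every standalone number is treated as a page reference.
    private static final Pattern INDEX = Pattern.compile("\\b(\\d+)\\b");

    // Returns the page numbers found in text, using the rule for the given region.
    static List<Integer> findRefs(String text, boolean inIndex) {
        Matcher m = (inIndex ? INDEX : BODY).matcher(text);
        List<Integer> pages = new ArrayList<>();
        while (m.find()) {
            pages.add(Integer.parseInt(m.group(1)));
        }
        return pages;
    }

    public static void main(String[] args) {
        // Body rule ignores the year and keeps only the explicit reference.
        System.out.println(findRefs("In 1998, see page 5.", false)); // [5]
        // Index rule accepts every number.
        System.out.println(findRefs("outline, 12, 40", true));       // [12, 40]
    }
}
```

Applying the index rule to body text would also pick up the year 1998, which is the kind of false match that restricting rules to a page range is meant to avoid.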
File Format
- One record per line.
- Comment lines start with a number sign (#).
- Record types
url bbox url
pageref bbox page-number
head1 page-number label
head2 page-number label
head3 page-number label
head4 page-number label
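A program that generates or post-processes such a file needs to read these records. The following is a minimal sketch under stated assumptions: fields are taken to be separated by whitespace, blank lines are ignored, and the bbox field is left unparsed because its syntax is not described here. The class MarkupRecords and its methods are hypothetical and not part of the tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for the record file; not part of tool.pdf.Markup.
class MarkupRecords {
    // One record: its type keyword and the unparsed remainder of the line.
    static final class Rec {
        final String type;
        final String rest;
        Rec(String type, String rest) { this.type = type; this.rest = rest; }
    }

    // Parses one record per line, skipping blank lines and # comments.
    static List<Rec> parse(List<String> lines) {
        List<Rec> out = new ArrayList<>();
        for (String line : lines) {
            String s = line.trim();
            if (s.isEmpty() || s.startsWith("#")) continue;
            String[] parts = s.split("\\s+", 2); // assumption: whitespace-separated
            out.add(new Rec(parts[0], parts.length > 1 ? parts[1] : ""));
        }
        return out;
    }

    // Returns 1..4 for head1..head4 records, or 0 for any other type.
    static int headLevel(Rec r) {
        return r.type.matches("head[1-4]") ? r.type.charAt(4) - '0' : 0;
    }

    // For a heading record, the page number is the first field after the type.
    static int headPage(Rec r) {
        return Integer.parseInt(r.rest.split("\\s+", 2)[0]);
    }

    // For a heading record, the label is everything after the page number.
    static String headLabel(Rec r) {
        String[] parts = r.rest.split("\\s+", 2);
        return parts.length > 1 ? parts[1] : "";
    }

    public static void main(String[] args) {
        List<Rec> recs = parse(List.of(
            "# outline for a sample document",
            "head1 1 Introduction",
            "head2 3 Getting Started"));
        for (Rec r : recs) {
            if (headLevel(r) > 0) {
                System.out.println(headLevel(r) + " " + headPage(r) + " " + headLabel(r));
            }
        }
    }
}
```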
Example
Often it's easiest to first run with -analyze,
capture the output in a file, edit,
and then apply the markup to the PDF with -apply file.
If you already have similar data from some other source,
just run it through some text processing software, like Perl,
to massage it into the exact format.
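As an illustration of such massaging, the sketch below converts an outline kept as tab-separated rows (level, page, title) into head records. The input layout is purely an assumption chosen for the example, and the class OutlineConverter is hypothetical; any text processing tool would do the same job.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical converter for illustration; not part of tool.pdf.Markup.
class OutlineConverter {
    // Turns rows of "level<TAB>page<TAB>title" into "headN page label" records.
    static List<String> toRecords(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (row.trim().isEmpty()) continue; // ignore blank rows
            String[] f = row.split("\t", 3);
            if (f.length < 3) {
                throw new IllegalArgumentException("expected level, page, title: " + row);
            }
            int level = Integer.parseInt(f[0].trim());
            int page = Integer.parseInt(f[1].trim());
            // Only head1 through head4 are listed as record types.
            if (level < 1 || level > 4) {
                throw new IllegalArgumentException("level out of range 1-4: " + row);
            }
            out.add("head" + level + " " + page + " " + f[2].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("1\t1\tIntroduction", "2\t3\tGetting Started");
        for (String rec : toRecords(rows)) {
            System.out.println(rec);
        }
    }
}
```

The resulting lines can be saved to a file and passed to -apply.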