Configuring (Non-embedded) Asian Font Support
PDF documents that embed any Asian fonts they reference will convert correctly with no need for additional fonts. Embedding the Asian fonts is Adobe's "recommended method for creating PDF files that are universally viewable." However, due to size considerations, most PDF documents that reference Asian fonts do not embed them.
To convert such documents, pdftodjvu is configured to use native Windows fonts or, on Linux/Unix, third-party fonts that are included with it.
On the Windows platform, pdftodjvu is configured to support the following Asian languages if the default fonts are installed: Japanese, Korean, Simplified Chinese and Traditional Chinese. These substitutions are adequate for most situations. However, advanced users may wish to install additional fonts and configure pdftodjvu to use them. The remainder of the topic describes how to configure this support.
On Unix/Linux systems, the publicly available fonts from the O'Reilly FTP site (ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/adobe) are used. These fonts are included as a free convenience and provide only a minimum level of support. Most Asian PDF files will convert adequately and most text will be extracted. However, the resulting font substitutions may not be suitable for some applications. To optimize conversion of such documents, consider the following options:
This will give a true rendering of the PDF document, but hyperlinks and bookmark information will be lost.
To configure Asian font support with Ghostscript, there are two steps:
Install the necessary font resources (CID or TrueType fonts).
Install any CID fonts or TrueType fonts you want to use. In some cases, the fonts you are seeking may already have been installed by other applications. Note the location of the font files. If you are installing new CID fonts for pdftodjvu, a good place to put them is:
c:\Program Files\Lizardtech\PDFtoDjVu\resources\CIDFont\
This is the default directory which is searched for fonts referenced from within a PDF file.
Specify the appropriate font substitutions.
Ghostscript uses the "cidfmap" file to specify CIDFont resource substitutions. By default, this file is installed as "C:\Program Files\LizardTech\PDFtoDjVu\gs8.11\lib\cidfmap".
If you add new fonts or require more sophisticated font substitution, you will probably want to edit this file. See the Ghostscript documentation (included in the pdftodjvu distribution) for more information. Basic instructions follow.
To add new substitutions, follow the format below:
/"font name in documents" << "keys and values of fonts to be assigned" >> ;
/FileType Data type of the font resource. /TrueType only.
/Path Path to the font resource. Specify an absolute path.
/SubfontID Optional. Index number of the font within a font file that contains multiple fonts, such as a TTC file.
/CSI CIDSystemInfo. Specify the character type that is defined in the CMap and CID font specification as [("ordering") "supplement"].
The following line specifies a substitution for all font references to "HGPGothicE" using the corresponding TrueType font:
/HGPGothicE << /FileType /TrueType /Path (c:/windows/fonts/hgrge.ttc) /SubfontID 1 /CSI [(Japan1) 4] >> ;
The following lines specify substitutions for all fonts using the Adobe-Japan1 and the Adobe-Korea1 character sets (the CID fonts mentioned here are from the O'Reilly FTP site and are installed in the resources/CIDFont directory mentioned above):
/Adobe-Korea1 /Munhwa-Regular ;
/Adobe-Japan1 /WadaMaruGo-Regular ;
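Because a misplaced space or semicolon is easy to introduce by hand, the two entry forms shown above can also be generated by a small script. The following Python sketch is illustrative only: the function names cidfmap_truetype_entry and cidfmap_alias_entry are hypothetical and are not part of pdftodjvu or Ghostscript. It assumes font names and paths contain no characters that need PostScript escaping (such as unbalanced parentheses).

```python
def cidfmap_truetype_entry(font_name, path, csi_ordering, csi_supplement,
                           subfont_id=None):
    """Build one cidfmap line that maps font_name to a TrueType file.

    Hypothetical helper: it only formats text in the layout shown in this
    topic and does not check that the font file exists.
    """
    # A backslash is an escape character inside a PostScript string,
    # so write the path with forward slashes, as in the examples above.
    ps_path = path.replace("\\", "/")
    keys = ["/FileType /TrueType", f"/Path ({ps_path})"]
    if subfont_id is not None:
        # Only needed for files that hold several fonts, such as TTC files.
        keys.append(f"/SubfontID {subfont_id}")
    keys.append(f"/CSI [({csi_ordering}) {csi_supplement}]")
    # The space before the closing semicolon is required.
    return f"/{font_name} << {' '.join(keys)} >> ;"


def cidfmap_alias_entry(font_name, substitute_name):
    """Build one cidfmap line that maps font_name to another font name."""
    return f"/{font_name} /{substitute_name} ;"


print(cidfmap_truetype_entry("HGPGothicE", r"c:\windows\fonts\hgrge.ttc",
                             "Japan1", 4, subfont_id=1))
print(cidfmap_alias_entry("Adobe-Korea1", "Munhwa-Regular"))
```

The printed lines can be pasted into the cidfmap file. Keeping the formatting in one place makes it harder to drop the trailing " ;" by accident.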
Entries in the cidfmap file are PostScript commands. Be careful with spelling and syntax. The semicolon at the end of each line and the space before it are required.
The fonts you reference must be installed where you say they are. If they are not, then the corresponding characters will not be rendered in your DjVu® image.
Supported fonts are CID-keyed and TrueType. OpenType fonts are not supported.
The path to TrueType fonts must be fully qualified. It is not always 'c:/windows/fonts'.
Earlier versions of pdftodjvu required the --gsargs switch to explicitly locate the CID fonts. This switch is still supported but is neither required nor recommended.
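Since a font that is not where the cidfmap file says it is will leave characters unrendered, it can help to check the /Path values before converting. The following Python sketch is one possible check, not a tool supplied with pdftodjvu. The function name missing_font_paths is hypothetical, and the parsing is deliberately simple: it assumes each /Path value sits on one line and contains no parentheses or percent signs.

```python
import os
import re

# Matches the /Path (...) key of a cidfmap dictionary entry.
PATH_KEY = re.compile(r"/Path\s*\(([^)]*)\)")


def missing_font_paths(cidfmap_text, exists=os.path.isfile):
    """Return the /Path values in cidfmap_text that do not name a file.

    Hypothetical helper. The exists argument can be replaced for testing;
    by default each path is checked on the local file system.
    """
    missing = []
    for line in cidfmap_text.splitlines():
        # A '%' starts a PostScript comment; ignore the rest of the line.
        code = line.split("%", 1)[0]
        for path in PATH_KEY.findall(code):
            if not exists(path):
                missing.append(path)
    return missing


sample = (
    "/HGPGothicE << /FileType /TrueType /Path (c:/windows/fonts/hgrge.ttc) "
    "/SubfontID 1 /CSI [(Japan1) 4] >> ;\n"
    "/Adobe-Korea1 /Munhwa-Regular ;\n"
)
for path in missing_font_paths(sample):
    print("Font file not found:", path)
```

To check a real installation, read the contents of the cidfmap file and pass them in place of sample. Entries that map one font name to another have no /Path key and are skipped by this check.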