Getting ###### instead of Emojis When Generating PDF using Apache FOP 2.4

Question

I'm generating PDFs in a web application that uses Apache FOP 2.4. When the content contains emojis, they appear in the PDF as ###############, while normal alphanumeric text is rendered correctly.
This is what the renderers section of my fop.xconf file looks like:

<renderers>
	<renderer mime="application/pdf">
		<filterList>
			<value>flate</value>
		</filterList>
		<fonts>
			<font embed-url="ARIAL.TTF">
				<font-triplet name="Arial" style="normal" weight="normal"/>
			</font>
			<font embed-url="ARIALBD.TTF">
				<font-triplet name="Arial" style="normal" weight="bold"/>
			</font>
			<font embed-url="ARIALI.TTF">
				<font-triplet name="Arial" style="italic" weight="normal"/>
			</font>
			<font embed-url="ARIALBI.TTF">
				<font-triplet name="Arial" style="italic" weight="bold"/>
			</font>
			<font embed-url="ARIALUNI.TTF">
				<font-triplet name="Arial Unicode MS" style="normal" weight="normal"/>
			</font>
			<font embed-url="NotoColorEmoji-Regular.ttf">
				<font-triplet name="Noto Color Emoji" style="normal" weight="normal"/>
			</font>
			<font embed-url="MSGOTHIC.TTC" metrics-url="msgothic.xml">
				<font-triplet name="Gothic" style="normal" weight="normal"/>
				<font-triplet name="Gothic" style="normal" weight="bold"/>
				<font-triplet name="Gothic" style="italic" weight="normal"/>
				<font-triplet name="Gothic" style="italic" weight="bold"/>
			</font>
		</fonts>
	</renderer>
</renderers>

I added NotoColorEmoji-Regular.ttf to support emojis, but the emojis still do not render properly in the PDF.
All the font files are at the same level as the fop.xconf file.
How can I resolve this emoji issue?

[Update as of 11/04/2023]

Once I added "Noto Color Emoji" to the font-family attribute in the XSL file, the ####### symbols stopped appearing. Instead, the PDF now shows empty white symbols.

In order to test this further, I created a sample application that generates PDF files using Apache FOP. You can find the project here <https://github.com/sachindragh/pdf-testing>.

The following are the Java code, the XSL file, input.xml, and the fop.xconf file from that project.

public static void main(String[] args) throws SAXException, TransformerException, IOException {
    FopFactory fopFactory = FopFactory.newInstance(new File("./fop.xconf"));
    OutputStream out = new BufferedOutputStream(new FileOutputStream(new File("fop2_8output.pdf")));
    try {
        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(new StreamSource(new File("stylesheet.xsl")));
        Source src = new StreamSource(new File("input.xml"));
        Result res = new SAXResult(fop.getDefaultHandler());
        transformer.transform(src, res);
        System.out.println("PDF file generated successfully.");
    } finally {
        out.close();
    }
}

stylesheet.xsl

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/">
        <!-- Vladimir Script    VLADIMIR.TTF -->
        <!-- Viner Hand ITC     VINERITC.TTF -->
        <!-- Noto Color Emoji   NotoColorEmoji-Regular.ttf -->
        <fo:root font-family="Noto Color Emoji,Viner Hand ITC">
            <fo:layout-master-set>
                <fo:simple-page-master master-name="A4-portrait"
                                       page-height="29.7cm" page-width="21.0cm" margin="2cm">
                    <fo:region-body/>
                </fo:simple-page-master>
            </fo:layout-master-set>
            <fo:page-sequence master-reference="A4-portrait">
                <fo:flow flow-name="xsl-region-body">
                    <fo:block font-family="Noto Color Emoji,Viner Hand ITC">
                        <xsl:value-of select="value"/>
                    </fo:block>
                </fo:flow>
            </fo:page-sequence>
        </fo:root>
    </xsl:template>
</xsl:stylesheet>

input.xml

<?xml version="1.0" encoding="UTF-8"?>
<value>
    This is a sample text Emojis
    BMP Glyphs  &#x2663; &#x2705;
    Non-BMP Glyphs  &#x1F410; &#x1F600;
</value>

fop.xconf

<?xml version="1.0"?>
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <filterList>
        <value>flate</value>
      </filterList>
      <fonts>
        <font embed-url="NotoColorEmoji-Regular.ttf">
            <font-triplet name="Noto Color Emoji" style="normal" weight="normal"/>
        </font>
        <font embed-url="VINERITC.TTF">
          <font-triplet name="Viner Hand ITC" style="normal" weight="normal"/>
        </font>
        <!--<font embed-url="VLADIMIR.TTF">
          <font-triplet name="Vladimir Script" style="normal" weight="normal"/>
        </font>-->
      </fonts>
      <version>1.7</version>
    </renderer>
  </renderers>
</fop>

I ran the above example with Apache FOP 2.4 and 2.8, with the PDF version set to 1.4 and 1.7 for each library version. In input.xml I included both BMP glyphs (U+2663, U+2705) and non-BMP glyphs (U+1F410, U+1F600). In every case I get the same result: empty white symbols for both the BMP and the non-BMP glyphs. This is what the PDF output looks like:

[Screenshot: PDF output]
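
As a sanity check on which of the test characters in input.xml really lie outside the BMP, here is a minimal sketch in plain Java. The class name PlaneCheck and the helper plane are hypothetical names chosen for illustration. Only the standard Character API is used, and nothing here involves FOP.

```java
class PlaneCheck {
    // Classifies a code point as "BMP" (U+0000..U+FFFF) or "non-BMP" (above U+FFFF).
    static String plane(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("Not a Unicode code point: " + codePoint);
        }
        return Character.isBmpCodePoint(codePoint) ? "BMP" : "non-BMP";
    }

    public static void main(String[] args) {
        // The four code points used in input.xml.
        int[] samples = {0x2663, 0x2705, 0x1F410, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %s%n", cp, plane(cp));
        }
    }
}
```

This reports U+2663 and U+2705 as BMP and U+1F410 and U+1F600 as non-BMP, which matches the grouping used in input.xml.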

When I select all, I can see that some characters are selected in those empty spaces.

[Screenshot: PDF output with all text selected]

When I copy the selected text and paste it into Notepad, some of the characters appear as follows.

[Screenshot: pasted text in Notepad]

I also looked into the FOP issue <https://issues.apache.org/jira/browse/FOP-1969> mentioned by @Kevin Brown. It seems that a fix for the issue was already merged in the 2.3 release.

[Screenshot]

Can anyone help me figure out this issue, or suggest any options I can try to solve it?

Answer 1

Score: 2

I do not believe that would be supported in FOP. Almost all of the characters in that font are in the Supplementary Multilingual Plane. From the FOP documentation:

"Support for Unicode characters outside of the Base Multilingual Plane (BMP), i.e., characters whose code points are greater than 65535, is not yet implemented. See issue FOP-1969."

For instance:

Unicode Character “🐐” (U+1F410) = Unicode 1F410 = 128016
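
To make that arithmetic concrete, here is a small sketch in plain Java (not FOP code; the class SurrogateDemo and the helper utf16Units are hypothetical names used only for illustration). It prints the decimal value of U+1F410 and the UTF-16 code units Java needs to store it. Because the code point is above 65535, it takes a surrogate pair rather than a single char.

```java
class SurrogateDemo {
    // Returns the UTF-16 code units of a code point as space-separated hex.
    static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int goat = 0x1F410;
        System.out.println(goat);                      // prints 128016
        System.out.println(Character.charCount(goat)); // prints 2
        System.out.println(utf16Units(goat));          // prints D83D DC10
    }
}
```

A BMP character such as U+2663 yields a single code unit (2663) from the same helper.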

If you install that font on a Windows machine and use charmap to view it, you will find only perhaps 35 characters. Searching for one of the emoji shows no character.

[Screenshot: charmap search showing no character]

If you try to view the entire font, you see only these:

[Screenshot: the characters charmap shows for the entire font]

huangapple
  • Published on 2023-04-04 04:19:03