Getting ###### instead of Emojis When Generating PDF using Apache FOP 2.4
Question
I'm generating PDFs in a web application that uses Apache FOP 2.4. When the content contains emojis, they appear in the PDF as ###############, while normal alphanumeric text renders properly.
This is the renderers section of my fop.xconf file:
<renderers>
<renderer mime="application/pdf">
<filterList>
<value>flate</value>
</filterList>
<fonts>
<font embed-url="ARIAL.TTF">
<font-triplet name="Arial" style="normal" weight="normal"/>
</font>
<font embed-url="ARIALBD.TTF">
<font-triplet name="Arial" style="normal" weight="bold"/>
</font>
<font embed-url="ARIALI.TTF">
<font-triplet name="Arial" style="italic" weight="normal"/>
</font>
<font embed-url="ARIALBI.TTF">
<font-triplet name="Arial" style="italic" weight="bold"/>
</font>
<font embed-url="ARIALUNI.TTF">
<font-triplet name="Arial Unicode MS" style="normal" weight="normal"/>
</font>
<font embed-url="NotoColorEmoji-Regular.ttf">
<font-triplet name="Noto Color Emoji" style="normal" weight="normal"/>
</font>
<font embed-url="MSGOTHIC.TTC" metrics-url="msgothic.xml">
<font-triplet name="Gothic" style="normal" weight="normal"/>
<font-triplet name="Gothic" style="normal" weight="bold"/>
<font-triplet name="Gothic" style="italic" weight="normal"/>
<font-triplet name="Gothic" style="italic" weight="bold"/>
</font>
</fonts>
</renderer>
</renderers>
I added NotoColorEmoji-Regular.ttf to support emojis, but they still do not render properly in the PDF.
All the font files are in the same directory as the fop.xconf file.
How can I resolve this emoji issue?
[Update as of 11/04/2023]
Once I added "Noto Color Emoji" to the font-family attribute in the XSL file, the ####### symbols stopped appearing. Instead, the emojis now render as empty white symbols.
In order to test this further, I created a sample application that generates PDF files using Apache FOP. You can find the project here <https://github.com/sachindragh/pdf-testing>.
The following are the Java code, the XSL file, input.xml, and the fop.xconf file from that project.
public static void main(String[] args) throws SAXException, TransformerException, IOException {
FopFactory fopFactory = FopFactory.newInstance(new File("./fop.xconf"));
OutputStream out = new BufferedOutputStream(new FileOutputStream(new File("fop2_8output.pdf")));
try {
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(new StreamSource(new File("stylesheet.xsl")));
Source src = new StreamSource(new File("input.xml"));
Result res = new SAXResult(fop.getDefaultHandler());
transformer.transform(src, res);
System.out.println("PDF file generated successfully.");
} finally {
out.close();
}
}
stylesheet.xsl
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<!-- Vladimir Script VLADIMIR.TTF -->
<!-- Viner Hand ITC VINERITC.TTF -->
<!-- Noto Color Emoji NotoColorEmoji-Regular.ttf -->
<fo:root font-family="Noto Color Emoji,Viner Hand ITC">
<fo:layout-master-set>
<fo:simple-page-master master-name="A4-portrait"
page-height="29.7cm" page-width="21.0cm" margin="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="A4-portrait">
<fo:flow flow-name="xsl-region-body">
<fo:block font-family="Noto Color Emoji,Viner Hand ITC">
<xsl:value-of select="value"/>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
input.xml
<?xml version="1.0" encoding="UTF-8"?>
<value>
This is a sample text Emojis
BMP Glyphs &#x2663; &#x2705;
Non-BMP Glyphs &#x1F410; &#x1F600;
</value>
fop.xconf
<?xml version="1.0"?>
<fop version="1.0">
<renderers>
<renderer mime="application/pdf">
<filterList>
<value>flate</value>
</filterList>
<fonts>
<font embed-url="NotoColorEmoji-Regular.ttf">
<font-triplet name="Noto Color Emoji" style="normal" weight="normal"/>
</font>
<font embed-url="VINERITC.TTF">
<font-triplet name="Viner Hand ITC" style="normal" weight="normal"/>
</font>
<!--<font embed-url="VLADIMIR.TTF">
<font-triplet name="Vladimir Script" style="normal" weight="normal"/>
</font>-->
</fonts>
<version>1.7</version>
</renderer>
</renderers>
</fop>
I ran the above example with Apache FOP versions 2.4 and 2.8, with the PDF version set to 1.4 and to 1.7 for each library version. In input.xml I included both BMP glyphs (U+2663, U+2705) and non-BMP glyphs (U+1F410, U+1F600), but I get the same result, empty white symbols, for both kinds. The following is what the PDF output looks like.
When I select all, I can see some characters being selected in those empty spaces.
When I copy and paste the selected text into Notepad, some of the characters appear as follows.
I also looked into the FOP issue <https://issues.apache.org/jira/browse/FOP-1969> mentioned by @Kevin Brown. It seems a fix for that issue was already merged in the 2.3 release.
Can anyone help me figure out this issue, or suggest any options I can try to solve it?
Answer 1
Score: 2
I do not believe that is supported in FOP. Almost all of the characters in that font are in the Supplementary Multilingual Plane. From the FOP documentation:
"Support for Unicode characters outside of the Base Multilingual Plane (BMP), i.e., characters whose code points are greater than 65535, is not yet implemented. See issue FOP-1969."
For instance:
Unicode character “🐐” (U+1F410) = hexadecimal 1F410 = decimal 128016, which is greater than 65535.
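To make the arithmetic concrete, here is a minimal Java sketch that checks which of the four test characters from the question fall outside the BMP and shows the UTF-16 code units Java uses for each. The class name `CodePointDemo` and its helper methods are hypothetical names chosen for illustration; only standard `java.lang.Character` methods are used, and nothing here calls FOP itself.

```java
public class CodePointDemo {

    /** True when the code point lies above the BMP, i.e. is greater than 65535 (0xFFFF). */
    public static boolean isOutsideBmp(int codePoint) {
        return Character.isSupplementaryCodePoint(codePoint);
    }

    /** Formats the UTF-16 code units that represent one code point, separated by spaces. */
    public static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two BMP and two non-BMP code points used in the question's input.xml.
        int[] samples = {0x2663, 0x2705, 0x1F410, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X = %d, outside BMP: %b, UTF-16 units: %s%n",
                    cp, cp, isOutsideBmp(cp), utf16Units(cp));
        }
    }
}
```

A non-BMP character such as U+1F410 needs two UTF-16 code units (a surrogate pair), whereas U+2663 and U+2705 each fit in one, which is why software that assumes one 16-bit unit per character can mishandle the former.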
If you install that font on a Windows machine and view it with charmap, you will find perhaps 35 characters. Searching for one of the emojis shows no character.
If you view the entire font in charmap, those few characters are all you see.