2023-02-14 19:07:18

What is the difference between the two font names?

Question

When I print TextRenderInfo.getFont().getPostscriptFontName() for two PDF files, I get AAAAAD+SourceHanSansCN-Normal and BIISMY+SourceHanSansCN-Normal.

I know that SourceHanSansCN-Normal follows the FontName-FontScript format, but what is AAAAAD? It does not look like a font family.

Example code:

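import java.io.IOException;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.ImageRenderInfo;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextRenderInfo;

public class CheckPdfAllFontTest implements TextExtractionStrategy {

    public static final String SRC = "ownTestFile.pdf";

    @Override
    public String getResultantText() {
        // This strategy only logs font names; it does not accumulate text.
        return "";
    }

    @Override
    public void beginTextBlock() {
        // Not needed for collecting font names.
    }

    @Override
    public void renderText(TextRenderInfo textRenderInfo) {
        // Print each text chunk together with the PostScript name of its font.
        String fontName = textRenderInfo.getFont().getPostscriptFontName();
        String text = textRenderInfo.getText();
        System.out.println(text + "=====" + fontName);
    }

    @Override
    public void endTextBlock() {
        // Not needed for collecting font names.
    }

    @Override
    public void renderImage(ImageRenderInfo imageRenderInfo) {
        // Images are ignored.
    }

    public static void main(String[] args) throws IOException {
        new CheckPdfAllFontTest().parse(SRC);
    }

    public void parse(String filename) throws IOException {
        int pageNumber = 1;
        PdfReader reader = new PdfReader(filename);
        try {
            // The font names are printed by renderText while the page is processed.
            PdfTextExtractor.getTextFromPage(reader, pageNumber, this);
        } finally {
            reader.close();
        }
    }
}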

iText version:

<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.8</version>
</dependency>

The two PDF files were exported from the same PowerPoint file, one with the "embed fonts" setting and one without it.

"AAAAAD+SourceHanSansCN-Normal" comes from the "embed fonts" PDF file.
"BIISMY+SourceHanSansCN-Normal" comes from the "do not embed fonts" PDF file.

I am collecting the fonts used in PDF files and found font names in this format. What is the part before the "+", and how is it defined?

Answer 1

Score: 1

According to the PDF specification:

>### 9.9.2 Font subsets
>
>PDF documents may include subsets of PDF fonts whose Subtype is Type1, TrueType or OpenType. The font and font descriptor that describe a font subset are slightly different from those of ordinary fonts. These differences allow a PDF processor to recognise font subsets and to merge documents containing different subsets of the same font. (For more information on font descriptors, see 9.8, "Font descriptors".)
>
>For a font subset, the PostScript name of the font, that is, the value of the font’s BaseFont entry and the font descriptor’s FontName entry, shall begin with a tag followed by a plus sign (+) followed by the PostScript name of the font from which the subset was created. The tag shall consist of exactly six uppercase letters; the choice of letters is arbitrary, but different subsets of the same font in the same PDF file shall have different tags. The glyph name .notdef shall be defined in the font subset.
>
>NOTE It is recommended that PDF processors treat multiple subset fonts as completely independent entities, even if they appear to have been created from the same original font.
>
>EXAMPLE EOODIA+Poetica is the name of a subset of Poetica®, a Type 1 font.

(ISO 32000-2)

Thus, AAAAAD and BIISMY are arbitrary subset tags rather than part of the font family name, and AAAAAD+SourceHanSansCN-Normal and BIISMY+SourceHanSansCN-Normal are most likely different subsets of the same source font.
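
If the goal is only to collect the names of the fonts used, one option is to strip the subset tag before storing the name. The following is a minimal sketch based on the naming rule quoted above (six uppercase letters followed by a plus sign). The class `SubsetFontNames` and its methods are hypothetical helpers, not part of iText, and the sketch assumes the tag uses ASCII letters A to Z.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of iText. */
final class SubsetFontNames {

    // Subset tag: exactly six uppercase letters followed by '+' at the start of the name.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+");

    private SubsetFontNames() {
    }

    /** Returns true if the PostScript name starts with a subset tag. */
    static boolean isSubset(String postscriptName) {
        return SUBSET_TAG.matcher(postscriptName).find();
    }

    /** Returns the name without the subset tag, or the name unchanged if there is none. */
    static String stripSubsetTag(String postscriptName) {
        return SUBSET_TAG.matcher(postscriptName).replaceFirst("");
    }

    /** Collects distinct base font names in first-seen order. */
    static Set<String> baseNames(List<String> postscriptNames) {
        Set<String> result = new LinkedHashSet<>();
        for (String name : postscriptNames) {
            result.add(stripSubsetTag(name));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> seen = Arrays.asList(
                "AAAAAD+SourceHanSansCN-Normal",
                "BIISMY+SourceHanSansCN-Normal",
                "Helvetica");
        System.out.println(baseNames(seen));
    }
}
```

In the question's `renderText` method, the value returned by `getPostscriptFontName()` could be passed through `stripSubsetTag` before printing. Keep in mind the NOTE in the specification: two subsets of the same font may contain different glyphs, so stripping the tag is reasonable for reporting which fonts a document uses, but not for treating the embedded font programs as interchangeable.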
