July 10, 2023, 13:25:16

Find all text in the 'Normal' style whose font size is NOT 11 in a docx file using python-docx

Question

My implementation so far:

from docx.api import Document
import pandas as pd
from docx.shared import Pt

texts = []
sizes = []
document = Document('new.docx')
for p in document.paragraphs:
    for run in p.runs:
        if p.style.name.startswith("Normal") and run.font.size != Pt(11):
            texts.append(run.text)
print(texts)

This seems to work, but some of the output is incorrect: I also get text that is in the Normal style with font size 11. Is this the correct implementation, or is there another way to achieve this? TIA!

Answer 1

Score: 2

Explanation

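What I learned is that styles are stored in a separate part of the .docx file by default. A formatting setting can be read from the text itself under only one condition: it differs from the setting of the style applied to the paragraph (e.g., Normal, No Spacing, Heading 1, Title). In that case, Word stores it with the text. Otherwise the size is inherited from the style, and python-docx returns None for run.font.size; since None != Pt(11) is true, your code also returns runs that are displayed at 11pt.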
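To check which runs inherit their size, it may help to list those for which python-docx returns None from run.font.size. This is a minimal sketch, assuming `document` is the object loaded in the question; `runs_without_direct_size` is a hypothetical helper name, not part of python-docx.

```python
def runs_without_direct_size(document):
    """Return the text of every run that stores no font size of its own.

    For these runs run.font.size is None, so a test such as
    `run.font.size != Pt(11)` is true regardless of the displayed size.
    """
    return [
        run.text
        for p in document.paragraphs
        for run in p.runs
        if run.font.size is None  # size is inherited, not stored with the text
    ]
```

Any text this returns that also appears in the question's output would be a run that only looks like a mismatch because its size is inherited.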

Another StackOverflow question thread for a better understanding:
link

Example

E.g., if the default font size for "Heading 1" in your Word document is 20pt and your text is also 20pt, you won't be able to extract it. But if it is anything else, your code will return it.
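One way to act on this is to resolve the size yourself: use the run's own size when it has one, otherwise check the run's character style and then the paragraph style, following each base_style link. The sketch below assumes `document` is a python-docx document as in the question. `effective_size` and `normal_runs_not_size` are hypothetical helper names. Runs whose size is not set by any style fall back to the document defaults, which this sketch does not read, so they are skipped rather than guessed at.

```python
EMU_PER_PT = 12700  # python-docx lengths are integers in EMU; Pt(1) == 12700


def effective_size(run, paragraph):
    """Return the run's font size, falling back to the style hierarchy.

    Returns None when neither the run nor any style in the chain sets a size.
    """
    if run.font.size is not None:  # direct formatting stored with the text
        return run.font.size
    # Character style of the run first, then the paragraph style,
    # each followed through its chain of base styles.
    for style in (run.style, paragraph.style):
        while style is not None:
            if style.font.size is not None:
                return style.font.size
            style = style.base_style
    return None


def normal_runs_not_size(document, points=11):
    """Collect text of 'Normal' runs whose resolved size differs from `points`."""
    target = points * EMU_PER_PT
    texts = []
    for p in document.paragraphs:
        if not p.style.name.startswith("Normal"):
            continue
        for run in p.runs:
            size = effective_size(run, p)
            # Assumption: an unresolved size (None) is unknown, so skip it.
            if size is not None and size != target:
                texts.append(run.text)
    return texts
```

With the question's document = Document('new.docx'), normal_runs_not_size(document) would return the text of Normal-style runs whose resolved size is not 11pt.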
