March 7, 2023 17:28:26

Transform a JSON file into a PDF

Question

I need to convert a JSON file into a PDF. I'm having trouble creating a table in which items that are too long wrap automatically instead of overflowing to the right.
Below is an example of the JSON I need to convert, followed by my Python implementation (which unfortunately produces a bad result).

json code:

"allIssues":[
      {
         "ruleId":"name",
         "description":"Description123",
         "help":"Description234",
         "impact":"critical",
         "selector":[
            "abc1234"
         ],
         "summary":"long text",
         "source":"long text2",
      },
      {
       ...
      },
            ]

My python implementation:

import json
from fpdf import FPDF
import pandas as pd
with open('input.json') as f:
    data = json.load(f)
pdf = FPDF()
pdf.add_page()
pdf.set_font('Arial', 'B', 16)
# Title
pdf.cell(0, 10, 'Inspection summary', 0, 1)
pdf.set_font('Arial', '', 12)
df = pd.DataFrame(data['allIssues'])
df = df[['ruleId', 'description', 'help', 'impact', 'selector', 'summary', 'source']]
col_width = pdf.w / 2.2
row_height = pdf.font_size * 2
for issue in df.itertuples(index=False):
    data = [
        ['ruleId:', str(issue.ruleId)],
        ['description:', issue.description],
        ['help:', issue.help],
        ['impact:', issue.impact],
        ['selector:', issue.selector],
        ['summary:', issue.summary],
        ['source:', issue.source],
    ]
    # Draw table
    for row in data:
        pdf.multi_cell(col_width, row_height, str(row[0]), border=0)
        pdf.multi_cell(col_width, row_height, str(row[1]), border=0)
        pdf.ln(row_height)
    # Draw line between tables
    pdf.line(10, pdf.get_y(), pdf.w - 10, pdf.get_y())
    pdf.ln(row_height)
pdf.output('output.pdf', 'F')

This is a screenshot of the output:

This is what I'm trying to achieve:

Can you give me a hand? Is it feasible to create something nice?
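
As a possible starting point with fpdf itself: `multi_cell` wraps text to the given width but then moves the cursor to the start of the next line, which is likely why the label and value in the code above end up stacked rather than side by side. The sketch below assumes the classic PyFPDF cursor methods (`get_x`, `get_y`, `set_xy`). The names `draw_row` and `build_pdf`, and the 35 mm label width, are hypothetical choices for illustration, and a page break in the middle of a row is not handled.

```python
FIELDS = ("ruleId", "description", "help", "impact", "selector", "summary", "source")


def draw_row(pdf, label, value, label_w, value_w, line_h):
    """Draw one label/value row with both cells starting at the same height."""
    x0, y0 = pdf.get_x(), pdf.get_y()
    # Left cell: multi_cell wraps the text, then moves the cursor below it.
    pdf.multi_cell(label_w, line_h, str(label), border=0)
    y_label = pdf.get_y()
    # Jump back to the top of the row, just right of the label column.
    pdf.set_xy(x0 + label_w, y0)
    pdf.multi_cell(value_w, line_h, str(value), border=0)
    y_value = pdf.get_y()
    # Continue below whichever of the two cells ended lower.
    pdf.set_xy(x0, max(y_label, y_value))


def build_pdf(issues, path="output.pdf"):
    """Hypothetical driver: one block of rows per issue (assumes fpdf is installed)."""
    from fpdf import FPDF  # imported lazily so draw_row can be used without it

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", "", 12)
    label_w = 35  # assumed label column width in mm
    value_w = pdf.w - pdf.l_margin - pdf.r_margin - label_w
    for issue in issues:
        for key in FIELDS:
            draw_row(pdf, key + ":", issue.get(key, ""), label_w, value_w, 6)
        # Separator line between issues, as in the original code.
        pdf.line(10, pdf.get_y(), pdf.w - 10, pdf.get_y())
        pdf.ln(4)
    pdf.output(path, "F")
```

With the JSON from the question, this would be called as `build_pdf(data['allIssues'])`. Keeping the value column inside the page margins is what lets long text wrap within the page.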

Answer 1

Score: 1

There seems to be a lot of steps in your code. You could simply loop over the columns of your transposed df and export each of them to html. Append all html tables to a root html element and export with `pdfkit`:
```python
import json
import pandas as pd
import lxml.etree as et
import pdfkit
your_json = """{"url": "https://www.abc123.com", "extensionVersion": "4.51.0", "axeVersion": "4.6.3", "standard": "WCAG 2.1 AA", "testingStartDate": "2023-04-03T09:35:06.177Z", "testingEndDate": "2023-04-03T09:35:06.177Z", "bestPracticesEnabled": false, "issueSummary": {"critical": 2, "moderate": 0, "minor": 0, "serious": 0, "bestPractices": 0, "needsReview": 0}, "remainingTestingSummary": {"run": false}, "igtSummary": [], "failedRules": [{"name": "button-name", "count": 1, "mode": "automated"}, {"name": "select-name", "count": 1, "mode": "automated"}], "needsReview": [],
"allIssues": [{"ruleId": "button-name", "description": "Ensures buttons have discernible text", "help": "Buttons must have discernible text", "helpUrl": "https://www.abc123.com", "impact": "critical", "needsReview": false, "isManual": false, "selector": [".livechat-button"], "summary": "Fix any of the following:\\n  Element does not have inner text that is visible to screen readers\\n  aria-label attribute does not exist or is empty\\n  aria-labelledby attribute does not exist, references elements that do not exist or references elements that are empty\\n  Element has no title attribute\\n  Element's default semantics were not overridden with role=\\"none\\" or role=\\"presentation\\"", "source": "<button class=\\"livechat-button items-center bg-black shadow-liveChat rounded-full text-white p-2 h-12 transition-all opacity-0 pointer-events-none w-sp-48 opacity-0 pointer-events-none\\">", "tags": ["cat.name-role-value", "wcag2a", "wcag412", "section508", "section508.22.a", "ACT"], "igt": "", "shareURL": "", "createdAt": "2023-04-03T09:35:06.177Z", "testUrl": "", "testPageTitle": "ABC123", "foundBy": "ab@bc.com", "axeVersion": "4.6.3"},
{"ruleId": "select-name", "description": "Ensures select element has an accessible name", "help": "Select element must have an accessible name", "helpUrl": "https://www.abc123.com", "impact": "critical", "needsReview": false, "isManual": false, "selector": ["#plp__sortSelected"], "summary": "Fix any of the following:\\n  Form element does not have an implicit (wrapped) <label>\\n  Form element does not have an explicit <label>\\n  aria-label attribute does not exist or is empty\\n  aria-labelledby attribute does not exist, references elements that do not exist or references elements that are empty\\n  Element has no title attribute\\n  Element's default semantics were not overridden with role=\\"none\\" or role=\\"presentation\\"", "source": "<select class=\\"w-full absolute opacity-0 appearance-none text-value-small font-bold text-black uppercase cursor-pointer bg-transparent outline-0\\" id=\\"plp__sortSelected\\">", "tags": ["cat.forms", "wcag2a", "wcag412", "section508", "section508.22.n", "ACT"], "igt": "", "shareURL": "", "createdAt": "2023-04-03T09:35:06.177Z", "testUrl": "https://www.abc123.com", "testPageTitle": "ABC123", "foundBy": "ab@bc.com", "axeVersion": "4.6.3"}]}"""
data = json.loads(your_json)
## replace the above lines with the following in your case
# with open('your_file.json', 'r') as f:
#     data = json.load(f)
html = et.Element("html")
# general info
html.append(et.fromstring(f"""<h3>Site link: <a href="{data['url']}">{data['url']}</a></h3>"""))
html.append(et.fromstring(f"""<h4>Date: {data['testingEndDate']}</h4>"""))
html.append(et.fromstring("""<h4>Summary:</h4>"""))
# summary table
summary = pd.Series(data['issueSummary'])
summary_table = et.fromstring(summary.to_frame().to_html(header=False))
summary_table.set('class', 'summary')
html.append(summary_table)
# issue tables
cols_of_interest = ['ruleId', 'description', 'help', 'impact', 'selector', 'summary', 'source']
df = pd.DataFrame(data['allIssues'])[cols_of_interest].T
for col in df.columns:
    table = et.fromstring(df[[col]].to_html(header=False))
    table.set('class', 'issue')
    html.append(table)
    html.append(et.fromstring('<br/>'))
pdfkit.from_string(et.tostring(html, encoding="unicode"), "./output.pdf", css='style.css')
```

With the following css file:

/* style.css */
* {
    font-family: 'Liberation Sans';
}
table {
    margin: 20px;
    margin-left: auto;
    margin-right: auto;
}
table.summary {
    width: 50%;
}
table.issue {
    border: 0;
    width: 100%;
    border-collapse: collapse;
}
table.issue td,
table.issue th {
    border: 0;
    text-align: left;
    padding: 5px;
}
table.issue tr {
    border-bottom: 1px solid #dddddd;
}

You'll get:

Edit: updated the JSON with the data you provided, exported additional data, and improved the CSS.

Note: you will need to install wkhtmltopdf and make sure that it is in your path.
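
A quick way to confirm that the binary is reachable before generating the PDF might look like the sketch below. It uses only the standard library; `find_wkhtmltopdf` is a hypothetical helper name, not part of pdfkit.

```python
import shutil


def find_wkhtmltopdf(name="wkhtmltopdf"):
    """Return the full path of the wkhtmltopdf binary, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install it or add its folder to PATH"
        )
    return path
```

If the binary lives outside the PATH, pdfkit also accepts an explicit location through `pdfkit.configuration(wkhtmltopdf=...)`, passed to `from_string` as the `configuration` argument.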

Edit2: limiting output to desired fields
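
If pandas and lxml are not available, the same idea (one two-column HTML table per issue, all joined into a single document) can be sketched with the standard library alone. This is an illustrative variant rather than the code used above; `issue_table` and `build_html` are hypothetical names.

```python
from html import escape

FIELDS = ["ruleId", "description", "help", "impact", "selector", "summary", "source"]


def issue_table(issue, fields=FIELDS):
    """Render one issue as a two-column HTML table (field name, value)."""
    rows = []
    for name in fields:
        value = issue.get(name, "")
        if isinstance(value, list):
            # e.g. "selector" is a list of CSS selectors
            value = ", ".join(map(str, value))
        # Escape markup such as "<button ...>" and keep line breaks visible.
        cell = escape(str(value)).replace("\n", "<br/>")
        rows.append(f"<tr><th>{escape(name)}</th><td>{cell}</td></tr>")
    return '<table class="issue">' + "".join(rows) + "</table>"


def build_html(issues):
    """Join all issue tables into one HTML document string."""
    body = "<br/>".join(issue_table(issue) for issue in issues)
    return f"<html><body>{body}</body></html>"
```

The resulting string can be passed to `pdfkit.from_string(build_html(data['allIssues']), "./output.pdf", css='style.css')`, reusing the `table.issue` rules from the stylesheet above. Because the HTML engine lays out the table, long cells wrap inside the page width without manual position bookkeeping.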

Answer 2

Score: 0

Disclaimer: I am the author of borb, the library used in this answer.

Assuming your data looks like this:

data = [
      {
         "ruleId":"name",
         "description":"Description123",
         "help":"Description234",
         "impact":"critical",
         "selector":[
            "abc1234"
         ],
         "summary":"long text",
         "source":"long text2",
      },
]

You can run the following code:

from borb.pdf import Document, Page, PageLayout, SingleColumnLayout, Paragraph, HexColor, Table, TableUtil, PDF
from decimal import Decimal
# create empty document
doc: Document = Document()
# create empty page
page: Page = Page()
doc.add_page(page)
# use a PageLayout to be able to add things easily
layout: PageLayout = SingleColumnLayout(page)
# generate a Table for each issue
for i, issue in enumerate(data):
  # add a header (Paragraph)
  layout.add(Paragraph("Issue %d" % i, font_size=Decimal(20), font_color=HexColor("#B5F8FE")))
  # add a Table (using the convenient TableUtil class)
  table: Table = TableUtil.from_2d_array([["Rule ID", issue.get("ruleId", "N.A.")],
                                          ["Description", issue.get("description", "N.A.")],
                                          ["Help", issue.get("help", "N.A.")],
                                          ["Impact", issue.get("impact", "N.A.")],
                                          ["Selector", str(issue.get("selector", []))],
                                          ["Summary", issue.get("summary", "N.A.")],
                                          ["Source", issue.get("source", "N.A.")],
                                          ], header_row=False, header_col=True, flexible_column_width=False)
  layout.add(table)
# store the PDF (PDF is imported from borb.pdf above)
with open("output.pdf", "wb") as fh:
  PDF.dumps(fh, doc)

This generates the following PDF:
