将 Pandas 中的索引标题行和列标题行组合起来。

huangapple go评论138阅读模式
英文:

Combine index header row and column header row in Pandas

问题

I create a dataframe and export to an html table. However the headers are off as below

How can I combine the index name row, and the column name row?

I want the table header to look like this:

将 Pandas 中的索引标题行和列标题行组合起来。

but it currently exports to html like this:

将 Pandas 中的索引标题行和列标题行组合起来。

I create the dataframe as below (example):

data = [{'Name': 'A', 'status': 'ok', 'host': '1', 'time1': '2020-01-06 06:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'A', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 06:31:06', 'time2': '-'}, {'Name': 'B', 'status': 'Alert', 'host': '1', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'B', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'B', 'status': 'ok', 'host': '4', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'Alert', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '3', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '4', 'time1': '-', 'time2': '-'}]

df = pandas.DataFrame(data)
df.set_index(['Name', 'status', 'host'], inplace=True)
html_body = df.to_html(bold_rows=False)

The index is set to have hierarchical rows, for easier reading in an html table:

print(df)

                           time1                time2
Name status host                                          
A    ok     1     2020-01-06 06:31:06  2020-02-06 21:10:00
        2     2020-01-06 06:31:06                    -
B    Alert  1     2020-01-06 10:31:06  2020-02-06 21:10:00
     ok     2     2020-01-06 10:31:06  2020-02-06 21:10:00
        4     2020-01-06 10:31:06  2020-02-06 21:10:00
C    Alert  2     2020-01-06 10:31:06  2020-02-06 21:10:00
     ok     3     2020-01-06 10:31:06  2020-02-06 21:10:00
        4                       -                    -

The only solution that I've got working is to set every column to index.
This doesn't seem practical tho, and leaves an empty row that must be manually removed:

将 Pandas 中的索引标题行和列标题行组合起来。

英文:

I create a dataframe and export to an html table. However the headers are off as below

How can I combine the index name row, and the column name row?

I want the table header to look like this:

将 Pandas 中的索引标题行和列标题行组合起来。

but it currently exports to html like this:

将 Pandas 中的索引标题行和列标题行组合起来。

I create the dataframe as below (example):

data = [{'Name': 'A', 'status': 'ok', 'host': '1', 'time1': '2020-01-06 06:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'A', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 06:31:06', 'time2': '-'}, {'Name': 'B', 'status': 'Alert', 'host': '1', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'B', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'B', 'status': 'ok', 'host': '4', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'Alert', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '3', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '4', 'time1': '-', 'time2': '-'}]

df = pandas.DataFrame(data)
df.set_index(['Name', 'status', 'host'], inplace=True)
html_body = df.to_html(bold_rows=False)

The index is set to have hierarchical rows, for easier reading in an html table:

print(df)

                               time1                time2
Name status host                                          
A    ok     1     2020-01-06 06:31:06  2020-02-06 21:10:00
            2     2020-01-06 06:31:06                    -
B    Alert  1     2020-01-06 10:31:06  2020-02-06 21:10:00
     ok     2     2020-01-06 10:31:06  2020-02-06 21:10:00
            4     2020-01-06 10:31:06  2020-02-06 21:10:00
C    Alert  2     2020-01-06 10:31:06  2020-02-06 21:10:00
     ok     3     2020-01-06 10:31:06  2020-02-06 21:10:00
            4                       -                    -

The only solution that I've got working is to set every column to index.
This doesn't seem practical tho, and leaves an empty row that must be manually removed:

将 Pandas 中的索引标题行和列标题行组合起来。

答案1

得分: 4

### 设置

import pandas as pd
from IPython.display import HTML

l0 = ('Foo', 'Bar')
l1 = ('One', 'Two')
ix = pd.MultiIndex.from_product([l0, l1], names=('L0', 'L1'))
df = pd.DataFrame(1, ix, ['W', 'X', 'Y', 'Z'])

HTML(df.to_html())

[![enter image description here][1]][1]
____

### BeautifulSoup
`df.to_html(header=False)` 中获取HTML结果提取表头中的空单元格并插入列名

from bs4 import BeautifulSoup

html_doc = df.to_html(header=False)
soup = BeautifulSoup(html_doc, 'html.parser')

empty_cols = soup.find('thead').find_all(lambda tag: not tag.contents)

for tag, col in zip(empty_cols, df):
    tag.string = col

HTML(soup.decode_contents())

[![enter image description here][2]][2]


  [1]: https://i.stack.imgur.com/bUDFS.png
  [2]: https://i.stack.imgur.com/62eBa.png
英文:

Setup

import pandas as pd
from IPython.display import HTML

l0 = ('Foo', 'Bar')
l1 = ('One', 'Two')
ix = pd.MultiIndex.from_product([l0, l1], names=('L0', 'L1'))
df = pd.DataFrame(1, ix, [*'WXYZ'])

HTML(df.to_html())

将 Pandas 中的索引标题行和列标题行组合起来。


BeautifulSoup

Hack the HTML result from df.to_html(header=False). Pluck out the empty cells in the table head and drop in the column names.

from bs4 import BeautifulSoup

html_doc = df.to_html(header=False)
soup = BeautifulSoup(html_doc, 'html.parser')

empty_cols = soup.find('thead').find_all(lambda tag: not tag.contents)

for tag, col in zip(empty_cols, df):
    tag.string = col

HTML(soup.decode_contents())

将 Pandas 中的索引标题行和列标题行组合起来。

答案2

得分: 1

以下是您要翻译的内容:

如果您想要使用数据框架Styler对表格、元素和内容执行许多精美的格式设置,那么您可能需要稍微修改piRSquared的答案,就像我做的那样。

转换前

style.to_html()添加了不间断空格,这导致标签.contents始终返回true,因此对表格没有任何更改。我修改了lambda以解决这个问题,这揭示了另一个问题。

lambda标签:(不是标签.contents)或'\xa0'在标签.contents中

单元格复制得很奇怪

Styler.to_html()缺少header kwarg - 我猜这是问题的根源。我采取了稍微不同的方法 - 将第二行标题移到第一行,然后销毁第二个标题行。

它似乎非常通用且可重用于任何多索引数据框架。

df_styler = summary_df.style

使用df_styler更改显示格式、颜色、对齐方式等。

raw_html = df_styler.to_html()
soup = BeautifulSoup(raw_html,'html.parser')
head = soup.find('thead')
trs = head.find_all('tr')
ths0 = trs[0].find_all(lambda tag: (not tag.contents) or '\xa0' in tag.contents)
ths1 = trs1.find_all(lambda tag: (tag.contents) or '\xa0' not in tag.contents)
for blank, filled in zip(ths0, ths1):
blank.replace_with(filled)
trs1.decompose()
final_html_str = soup.decode_contents()

成功 - 两个标题行合并为一个

非常感谢piRSquared提供Beautiful Soup的起点!

英文:

If you want to use a Dataframe Styler to perform a lot of wonderful formatting on your table, the elements, and the contents, then you might need a slight change to piRSquared's answer, as I did.

before transformation

style.to_html() added non-breaking spaces which made tag.contents always return true, and thus yielded no change to the table. I modified the lambda to account for this, which revealed another issue.

lambda tag: (not tag.contents) or '\xa0' in tag.contents

Cells were copied strangely

Styler.to_html() lacks the header kwarg - I am guessing this is the source of the issue. I took a slightly different approach - Move the second row headers into the first row, and then destroy the second header row.

It seems pretty generic and reusable for any multi-indexed dataframe.

df_styler = summary_df.style
# Use the df_styler to change display format, color, alignment, etc.
raw_html = df_styler.to_html()
soup = BeautifulSoup(raw_html,'html.parser')
head = soup.find('thead')
trs = head.find_all('tr')
ths0 = trs[0].find_all(lambda tag: (not tag.contents) or '\xa0' in tag.contents)
ths1 = trs[1].find_all(lambda tag: (tag.contents) or '\xa0' not in tag.contents)
for blank, filled in zip(ths0, ths1):
    blank.replace_with(filled)
trs[1].decompose()
final_html_str = soup.decode_contents()

Success - two header rows condensed into one

Big Thanks to piRSquared for the starting point of Beautiful soup!

huangapple
  • 本文由 发表于 2020年1月7日 01:56:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/59616757.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定