英文:
Combine index header row and column header row in Pandas
问题
I create a dataframe and export to an html table. However the headers are off as below
How can I combine the index name row, and the column name row?
I want the table header to look like this:
but it currently exports to html like this:
I create the dataframe as below (example):
data = [{'Name': 'A', 'status': 'ok', 'host': '1', 'time1': '2020-01-06 06:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'A', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 06:31:06', 'time2': '-'}, {'Name': 'B', 'status': 'Alert', 'host': '1', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'B', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'B', 'status': 'ok', 'host': '4', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'Alert', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '3', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '4', 'time1': '-', 'time2': '-'}]
df = pandas.DataFrame(data)
df.set_index(['Name', 'status', 'host'], inplace=True)
html_body = df.to_html(bold_rows=False)
The index is set to have hierarchical rows, for easier reading in an html table:
print(df)
time1 time2
Name status host
A ok 1 2020-01-06 06:31:06 2020-02-06 21:10:00
2 2020-01-06 06:31:06 -
B Alert 1 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 2 2020-01-06 10:31:06 2020-02-06 21:10:00
4 2020-01-06 10:31:06 2020-02-06 21:10:00
C Alert 2 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 3 2020-01-06 10:31:06 2020-02-06 21:10:00
4 - -
The only solution that I've got working is to set every column to index.
This doesn't seem practical tho, and leaves an empty row that must be manually removed:
英文:
I create a dataframe and export to an html table. However the headers are off as below
How can I combine the index name row, and the column name row?
I want the table header to look like this:
but it currently exports to html like this:
I create the dataframe as below (example):
data = [{'Name': 'A', 'status': 'ok', 'host': '1', 'time1': '2020-01-06 06:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'A', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 06:31:06', 'time2': '-'}, {'Name': 'B', 'status': 'Alert', 'host': '1', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'B', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'B', 'status': 'ok', 'host': '4', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'Alert', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '3', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '4', 'time1': '-', 'time2': '-'}]
df = pandas.DataFrame(data)
df.set_index(['Name', 'status', 'host'], inplace=True)
html_body = df.to_html(bold_rows=False)
The index is set to have hierarchical rows, for easier reading in an html table:
print(df)
time1 time2
Name status host
A ok 1 2020-01-06 06:31:06 2020-02-06 21:10:00
2 2020-01-06 06:31:06 -
B Alert 1 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 2 2020-01-06 10:31:06 2020-02-06 21:10:00
4 2020-01-06 10:31:06 2020-02-06 21:10:00
C Alert 2 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 3 2020-01-06 10:31:06 2020-02-06 21:10:00
4 - -
The only solution that I've got working is to set every column to index.
This doesn't seem practical tho, and leaves an empty row that must be manually removed:
答案1
得分: 4
### 设置
import pandas as pd
from IPython.display import HTML
l0 = ('Foo', 'Bar')
l1 = ('One', 'Two')
ix = pd.MultiIndex.from_product([l0, l1], names=('L0', 'L1'))
df = pd.DataFrame(1, ix, ['W', 'X', 'Y', 'Z'])
HTML(df.to_html())
[![enter image description here][1]][1]
____
### BeautifulSoup
从 `df.to_html(header=False)` 中获取HTML结果。提取表头中的空单元格并插入列名。
from bs4 import BeautifulSoup
html_doc = df.to_html(header=False)
soup = BeautifulSoup(html_doc, 'html.parser')
empty_cols = soup.find('thead').find_all(lambda tag: not tag.contents)
for tag, col in zip(empty_cols, df):
tag.string = col
HTML(soup.decode_contents())
[![enter image description here][2]][2]
[1]: https://i.stack.imgur.com/bUDFS.png
[2]: https://i.stack.imgur.com/62eBa.png
英文:
Setup
import pandas as pd
from IPython.display import HTML
l0 = ('Foo', 'Bar')
l1 = ('One', 'Two')
ix = pd.MultiIndex.from_product([l0, l1], names=('L0', 'L1'))
df = pd.DataFrame(1, ix, [*'WXYZ'])
HTML(df.to_html())
BeautifulSoup
Hack the HTML result from df.to_html(header=False)
. Pluck out the empty cells in the table head and drop in the column names.
from bs4 import BeautifulSoup
html_doc = df.to_html(header=False)
soup = BeautifulSoup(html_doc, 'html.parser')
empty_cols = soup.find('thead').find_all(lambda tag: not tag.contents)
for tag, col in zip(empty_cols, df):
tag.string = col
HTML(soup.decode_contents())
答案2
得分: 1
以下是您要翻译的内容:
如果您想要使用数据框架Styler对表格、元素和内容执行许多精美的格式设置,那么您可能需要稍微修改piRSquared的答案,就像我做的那样。
style.to_html()添加了不间断空格,这导致标签.contents始终返回true,因此对表格没有任何更改。我修改了lambda以解决这个问题,这揭示了另一个问题。
lambda标签:(不是标签.contents)或'\xa0'在标签.contents中
Styler.to_html()缺少header kwarg - 我猜这是问题的根源。我采取了稍微不同的方法 - 将第二行标题移到第一行,然后销毁第二个标题行。
它似乎非常通用且可重用于任何多索引数据框架。
df_styler = summary_df.style
使用df_styler更改显示格式、颜色、对齐方式等。
raw_html = df_styler.to_html()
soup = BeautifulSoup(raw_html,'html.parser')
head = soup.find('thead')
trs = head.find_all('tr')
ths0 = trs[0].find_all(lambda tag: (not tag.contents) or '\xa0' in tag.contents)
ths1 = trs1.find_all(lambda tag: (tag.contents) or '\xa0' not in tag.contents)
for blank, filled in zip(ths0, ths1):
blank.replace_with(filled)
trs1.decompose()
final_html_str = soup.decode_contents()
非常感谢piRSquared提供Beautiful Soup的起点!
英文:
If you want to use a Dataframe Styler to perform a lot of wonderful formatting on your table, the elements, and the contents, then you might need a slight change to piRSquared's answer, as I did.
style.to_html() added non-breaking spaces which made tag.contents always return true, and thus yielded no change to the table. I modified the lambda to account for this, which revealed another issue.
lambda tag: (not tag.contents) or '\xa0' in tag.contents
Styler.to_html() lacks the header kwarg - I am guessing this is the source of the issue. I took a slightly different approach - Move the second row headers into the first row, and then destroy the second header row.
It seems pretty generic and reusable for any multi-indexed dataframe.
df_styler = summary_df.style
# Use the df_styler to change display format, color, alignment, etc.
raw_html = df_styler.to_html()
soup = BeautifulSoup(raw_html,'html.parser')
head = soup.find('thead')
trs = head.find_all('tr')
ths0 = trs[0].find_all(lambda tag: (not tag.contents) or '\xa0' in tag.contents)
ths1 = trs[1].find_all(lambda tag: (tag.contents) or '\xa0' not in tag.contents)
for blank, filled in zip(ths0, ths1):
blank.replace_with(filled)
trs[1].decompose()
final_html_str = soup.decode_contents()
Success - two header rows condensed into one
Big Thanks to piRSquared for the starting point of Beautiful soup!
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论