Apply highlight to pivot_table
Question
I build a DataFrame with the pivot_table function:
df2 = pd.pivot_table(df, values=['labor costs'],
                     index=['Division', 'Perfomer'],
                     columns=['completed on time'], aggfunc=[np.sum, len], margins=True, fill_value=0)
How can I highlight the rows that meet the condition on schedule == 0 and overdue == 0, as in the table above?
This is what I tried:
def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
    print('df_slice.index', df_slice.index)
    print('df_slice.columns', df_slice.columns)
    styles_df['Perfomer'] = np.select([
        # Condition 1
        df_slice['overdue'] == 0 & df_slice['on schedule'] == 0
    ], [
        # Color for Condition 1
        'background-color: silver',
    ])
    return styles_df

df2.style.apply(apply_colors, axis=None)
And I get: KeyError: 'overdue'
The output of print('df_slice.columns', df_slice.columns) is:
df_slice.columns MultiIndex([('sum', 'labor costs', 'on time'),
            ('sum', 'labor costs', 'on schedule'),
            ('sum', 'labor costs', 'overdue'),
            ('len', 'labor costs', 'on time'),
            ('len', 'labor costs', 'on schedule'),
            ('len', 'labor costs', 'overdue')],
           names=[None, None, 'completed on time'])
Answer 1
Score: 1
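I think you need to wrap each comparison in parentheses, because & binds more tightly than ==. The KeyError occurs because 'overdue' sits in the third level of the column MultiIndex, so select it with DataFrame.xs and use DataFrame.any to test for at least one match per row. Pass the default parameter to numpy.select so that rows matching no condition get an empty string, and use numpy.broadcast_to to repeat each row's color across all columns: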
def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    # print('df_slice.index', df_slice.index)
    # print('df_slice.columns', df_slice.columns)
    arr = np.select([
        # Condition 1
        ((df_slice.xs('overdue', axis=1, level=2) == 0) &
         (df_slice.xs('on schedule', axis=1, level=2) == 0)).any(axis=1)
    ], [
        # Color for Condition 1
        'background-color: silver',
    ], default='')
    return pd.DataFrame(np.broadcast_to(arr[:, None], df_slice.shape),
                        index=df_slice.index,
                        columns=df_slice.columns)
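To check the function without the original data, here is a minimal sketch on a small made-up frame. The sample rows and names such as 'Ann' and 'Bob' are assumptions for illustration only, the aggregation uses the string 'sum' instead of np.sum, and margins=True is left out to keep the output small. It builds the same three-level column MultiIndex and confirms that only rows with zero 'on schedule' and zero 'overdue' get a style string.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'Division': ['A', 'A', 'B', 'B'],
    'Perfomer': ['Ann', 'Bob', 'Cid', 'Dee'],
    'completed on time': ['on time', 'overdue', 'on time', 'on schedule'],
    'labor costs': [10, 20, 30, 40],
})

# Same shape of pivot as in the question (margins omitted for brevity).
df2 = pd.pivot_table(df, values=['labor costs'],
                     index=['Division', 'Perfomer'],
                     columns=['completed on time'],
                     aggfunc=['sum', len], fill_value=0)


def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    # Select the status columns from the third column level (level=2).
    overdue_zero = df_slice.xs('overdue', axis=1, level=2) == 0
    schedule_zero = df_slice.xs('on schedule', axis=1, level=2) == 0
    # One boolean per row: True if both are zero for any aggregation.
    mask = (overdue_zero & schedule_zero).any(axis=1)
    # Rows that match get the color, all others an empty style string.
    arr = np.select([mask], ['background-color: silver'], default='')
    # Repeat each row's style across every column.
    return pd.DataFrame(np.broadcast_to(arr[:, None], df_slice.shape),
                        index=df_slice.index,
                        columns=df_slice.columns)


# Frame of CSS strings with the same index and columns as df2.
styles = apply_colors(df2)
print(styles)
```

To render the highlighted table, pass the function to the Styler as in the question: df2.style.apply(apply_colors, axis=None).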