Apply highlight to pivot_table

Question

I build a DataFrame with the pivot_table function:

df2 = pd.pivot_table(df, values=['labor costs'],
                     index=['Division', 'Perfomer'],
                     columns=['completed on time'], aggfunc=[np.sum, len], margins=True, fill_value=0)

[Image: Apply highlight to pivot_table]

How can I highlight the rows that meet the condition on schedule == 0 and overdue == 0, as in the table above?

I tried:

def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
    print('df_slice.index', df_slice.index)
    print('df_slice.columns', df_slice.columns)
    styles_df['Perfomer'] = np.select([
        # Condition 1
        df_slice['overdue'] == 0 & df_slice['on schedule'] == 0
    ], [
        # Color for Condition 1
        'background-color: silver',
    ])
    return styles_df

df2.style.apply(apply_colors, axis=None)

This raises KeyError: 'overdue'.

#print('df_slice.columns', df_slice.columns) 
df_slice.columns MultiIndex([('sum', 'labor costs',     'on time'),
                             ('sum', 'labor costs', 'on schedule'),
                             ('sum', 'labor costs',     'overdue'),
                             ('len', 'labor costs',     'on time'),
                             ('len', 'labor costs', 'on schedule'),
                             ('len', 'labor costs',     'overdue')],
                            names=[None, None, 'completed on time'])

Answer 1

Score: 1

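I think you need to wrap each comparison in parentheses, because & binds more tightly than ==, and select the columns with DataFrame.xs on the third level of the MultiIndex, since df_slice['overdue'] only looks at the first level and that is what raises the KeyError. DataFrame.any(axis=1) then tests whether at least one aggregate matches in each row, the default parameter of numpy.select supplies an empty string for rows that do not match, and numpy.broadcast_to repeats each row's style across all columns: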

import numpy as np
import pandas as pd

def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    # One style string per row: silver where the condition holds, '' otherwise
    arr = np.select([
        # Condition 1: 'overdue' and 'on schedule' are both 0 for at least
        # one aggregate (these labels live in column level 2)
        ((df_slice.xs('overdue', axis=1, level=2) == 0) &
         (df_slice.xs('on schedule', axis=1, level=2) == 0)).any(axis=1)
    ], [
        # Color for Condition 1
        'background-color: silver',
    ], default='')

    # Repeat each row's style across every column of the slice
    return pd.DataFrame(np.broadcast_to(arr[:, None], df_slice.shape),
                        index=df_slice.index,
                        columns=df_slice.columns)
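
To check the function without the original data, here is a minimal sketch on a small hand-built frame. The values and the row labels (Ann, Bob, Cid) are made up for illustration; only the column layout copies the MultiIndex printed in the question. It is one way to verify the result, not the only one.

```python
import numpy as np
import pandas as pd


def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
    # True for rows where 'overdue' and 'on schedule' are both 0
    # in at least one aggregate (labels are in column level 2)
    mask = ((df_slice.xs('overdue', axis=1, level=2) == 0) &
            (df_slice.xs('on schedule', axis=1, level=2) == 0)).any(axis=1)
    arr = np.select([mask], ['background-color: silver'], default='')
    # Repeat each row's style across every column
    return pd.DataFrame(np.broadcast_to(arr[:, None], df_slice.shape),
                        index=df_slice.index,
                        columns=df_slice.columns)


# Same three-level column layout as in the question's printout
columns = pd.MultiIndex.from_product(
    [['sum', 'len'], ['labor costs'], ['on time', 'on schedule', 'overdue']],
    names=[None, None, 'completed on time'])

# Hypothetical rows, not taken from the original data
index = pd.MultiIndex.from_tuples(
    [('A', 'Ann'), ('A', 'Bob'), ('B', 'Cid')],
    names=['Division', 'Perfomer'])

data = [
    [10, 0, 0, 1, 0, 0],   # nothing on schedule or overdue: highlighted
    [0, 20, 5, 0, 2, 1],   # both present: not highlighted
    [30, 0, 7, 3, 0, 1],   # overdue present: not highlighted
]
df2 = pd.DataFrame(data, index=index, columns=columns)

styles = apply_colors(df2)
print(styles)
```

In a notebook, df2.style.apply(apply_colors, axis=None) renders the table with these styles. With axis=None the Styler passes the whole frame to the function, so the returned frame must have the same index and columns as its input.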

huangapple
  • Published on August 10, 2023 at 15:53:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76873668.html