x axis value ranges not sequential in seaborn barplot & pointplot as subplots

Question



My data frames are:

df['graph_df_uni_valid']

             group  MO_SNCE_REC_APP     Label  predictions
0  (-0.001, 25.0]            24324  0.042551     0.042118
1    (25.0, 45.0]            24261  0.035077     0.033748
2    (45.0, 64.0]            23000  0.033391     0.033354
3    (64.0, 83.0]            22960  0.028876     0.028351
4   (83.0, 118.0]            23725  0.028872     0.029056
5  (118.0, 174.0]            23354  0.021024     0.022121
6            miss                0  0.009165     0.008978

df['graph_df_uni_oot']

             group  MO_SNCE_REC_APP     Label  predictions
0  (-0.001, 25.0]            28942  0.033308     0.041806
1    (25.0, 44.0]            28545  0.027921     0.034701
2    (44.0, 64.0]            27934  0.026634     0.033682
3    (64.0, 83.0]            27446  0.021132     0.028101
4   (83.0, 119.0]            28108  0.022236     0.028721
5  (119.0, 171.0]            27812  0.015892     0.020897
6            miss                0  0.007614     0.009352

The issue is that the x-axis of the Test (and OOT) plot is not in sequential order, i.e. bin (11.0 – 102.0] should be last, not 2nd in the sequence.

My data is already in the correct sequence, so I used sort=False for pointplot (or lineplot) and order=df['graph_df_uni_valid'].sort_values(by='group').group for barplot. However, I get the same unordered x-axis with or without these parameters.

Here is my code:

    fig, ax = plt.subplots(nrows = 1, ncols = 2, figsize = (12,5), sharex = False, sharey = True, tight_layout = True)
    fig.supxlabel(desc, ha = 'center', wrap = True)
    fig.suptitle(f"{col} (Rank #{rank}, TotGain: {totgain}, Cum TotGain: {cumtotgain})", fontsize = 16)
  
    ax1_line = ax[0].twinx()
    ax2_line = ax[1].twinx()

   
    
    ax2_line.get_shared_y_axes().join(ax1_line,ax2_line)

    ax[0] = sns.barplot(data = df['graph_df_uni_valid'], ax = ax[0], x = 'group', y = col, color = 'blue', order=df['graph_df_uni_valid'].sort_values(by='group').group)
    ax[0].set(xlabel = '', ylabel = 'Count')
    ax[0].tick_params(axis = 'x', rotation = 60)

    ax1_line = sns.pointplot(data = df['graph_df_uni_valid'], ax = ax1_line, x = 'group', y = target, sort= False, color = 'red', marker = '.')    
    ax1_line = sns.pointplot(data = df['graph_df_uni_valid'], ax = ax1_line, x = 'group', y = sc, sort= False, color = 'green', marker = '.')
    ax1_line.set(xlabel = '', ylabel = 'Book Rate/Score')
    ax[0].set_title('Test (202205 - 202208)')

    ax[1] = sns.barplot(data = df['graph_df_uni_oot'], ax = ax[1], x = 'group', y = col, color = 'blue', order=df['graph_df_uni_oot'].sort_values(by='group').group)
    ax[1].set(xlabel = '', ylabel = 'Count')
    ax[1].tick_params(axis = 'x', rotation = 60)

    ax2_line = sns.pointplot(data = df['graph_df_uni_oot'], x = 'group', y = target, sort= False, color = 'red', marker = '.')
    ax2_line = sns.pointplot(data = df['graph_df_uni_oot'], ax = ax2_line, x = 'group', y = sc, sort=False, color = 'green', marker = '.')
    ax2_line.set(xlabel = '', ylabel = 'Book Rate/Score')    
    ax[1].set_title('OOT (202204)')

If I change the barplot parameter to order=df['graph_df_uni_valid'].index, I get the desired x-axis sequence, but the bars disappear.

versions

  • matplotlib 3.4.0
  • seaborn 0.10.0

Second question: how do I add a legend showing that the red line is 'Book rate', the green line is 'Score', and the blue bars are 'Volume'?

Answer 1

Score: 2

  • Aggregating the data with .groupby is not necessary.

    • Although not shown in the question, the shape of the sample indicates that it was used.
    • sns.barplot and sns.pointplot both have an estimator parameter that sets the statistical function used for aggregation. The default is 'mean'.
      • If there is aggregation, there will be error bars, which can be removed with the errorbar parameter (ci in older versions).
  • Add a column with pd.cut, which creates categorically ordered bins (ordered=True by default).

    • Since the bins are ordered, the x-axis will be ordered.
  • Legends:

    • Add labels for the plots on ax1 and ax1y.
    • Get the handles and labels.
    • Remove the axes legends.
    • Create a figure legend with the combined handles and labels.
    • See "How to put the legend outside the plot" for other placement options by changing loc and bbox_to_anchor.
  • Tested in python 3.11.2, pandas 2.0.1, matplotlib 3.7.1, seaborn 0.12.2

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# create the dataframe
df = sns.load_dataset('geyser')

# create the categorically ordered groups
df['group'] = pd.cut(df.duration, bins=np.arange(1.6, 5.2, 0.5), ordered=True)

# create the figure and axes
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(12, 5), sharex=False, sharey=True, tight_layout=True)
ax1y = ax1.twinx()
ax2y = ax2.twinx()

# select the data for ax1
long = df[df.kind.eq('long')]

# plot
sns.barplot(data=long, x='group', y='duration', ax=ax1, color='tab:blue', label='Duration', errorbar=None)
sns.pointplot(data=long, x='group', y='waiting', ax=ax1y, color='tab:red', label='Waiting', errorbar=None)

ax1.set(title='Geyser: long wait time and duration')

# create the legends on ax1 and ax1y
ax1.legend()
ax1y.legend()

# get the legend handles and labels
h1, l1 = ax1.get_legend_handles_labels()
h1y, l1y = ax1y.get_legend_handles_labels()

# remove the axes legend
ax1.get_legend().remove()
ax1y.get_legend().remove()

# add a figure legend from the combined handles and labels
fig.legend(h1 + h1y, l1 + l1y, loc='lower center', ncols=2, bbox_to_anchor=(0.5, 0), frameon=False)

# select the data for ax2
short = df[df.kind.eq('short')]

# plot
sns.barplot(data=short, x='group', y='duration', ax=ax2, color='tab:blue', errorbar=None)
sns.pointplot(data=short, x='group', y='waiting', ax=ax2y, color='tab:red', errorbar=None)

_ = ax2.set(title='Geyser: short wait time and duration')


df.head()

   duration  waiting   kind       group
0     3.600       79   long  (3.1, 3.6]
1     1.800       54  short  (1.6, 2.1]
2     3.333       74   long  (3.1, 3.6]
3     2.283       62  short  (2.1, 2.6]
4     4.533       85   long  (4.1, 4.6]
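
The example above bins raw data with pd.cut. For frames that are already aggregated, like those in the question, a similar effect might be obtained by converting the existing string labels to an ordered categorical. The sketch below is not part of the original answer. It assumes the group column holds plain strings and that the rows are already in the desired sequence. The frame name valid is hypothetical, and the values are copied from graph_df_uni_valid.

```python
import pandas as pd

# Pre-aggregated frame shaped like the question's data
# (values copied from graph_df_uni_valid; the name `valid` is hypothetical).
valid = pd.DataFrame({
    "group": ["(-0.001, 25.0]", "(25.0, 45.0]", "(45.0, 64.0]", "(64.0, 83.0]",
              "(83.0, 118.0]", "(118.0, 174.0]", "miss"],
    "MO_SNCE_REC_APP": [24324, 24261, 23000, 22960, 23725, 23354, 0],
})

# Freeze the current row order as the category order
# (assumes the rows are already in the desired sequence).
valid["group"] = pd.Categorical(
    valid["group"], categories=valid["group"].tolist(), ordered=True
)

# Sorting now follows the category order instead of string order,
# even after the rows have been shuffled.
shuffled = valid.sample(frac=1, random_state=0)
restored = shuffled.sort_values("group")
print(restored["group"].tolist())
```

For a column with a categorical dtype, seaborn's categorical plots take the category order as the default x-axis order, so passing this frame to sns.barplot and sns.pointplot should not need an explicit order argument.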

Answer 2

Score: 0

Since my data was already in the correct sequence, I only needed sort=False for pointplot (or lineplot) and no order parameter for barplot. The x-axis then comes out in the correct order.
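
One possible reason the order argument in the question scrambled the axis, assuming the group column holds plain strings: sort_values then compares the labels character by character, so "(118.0, 174.0]" sorts ahead of "(25.0, 45.0]". The sketch below reproduces this with the labels from graph_df_uni_valid. It also shows a hypothetical helper, left_edge, that builds a numeric order for cases where the rows are not already in sequence.

```python
# Bin labels copied from the question's graph_df_uni_valid, already in the desired order.
labels = [
    "(-0.001, 25.0]", "(25.0, 45.0]", "(45.0, 64.0]", "(64.0, 83.0]",
    "(83.0, 118.0]", "(118.0, 174.0]", "miss",
]

# Plain string sorting compares character by character,
# so "(118..." sorts before "(25...".
lexicographic = sorted(labels)


def left_edge(label):
    """Hypothetical helper: numeric left edge of a bin label; other labels sort last."""
    if not label.startswith("("):
        return float("inf")
    return float(label[1:].split(",")[0])


# Sorting on the numeric left edge recovers the intended sequence.
numeric = sorted(labels, key=left_edge)

print(lexicographic)
print(numeric == labels)
```

A list such as numeric could be passed as order= to sns.barplot when the frame is not already sorted. Under the same assumption, order=df['graph_df_uni_valid'].index would supply the integers 0 to 6, which match none of the string labels, and that may be why the bars disappeared.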


huangapple
  • Posted on May 25, 2023 at 07:31:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76327995.html