Frequency plot using dots instead of bars?

Question

I'm trying to create the chart in this question, using this answer. I'm open to any solution that works.

Visual borrowed from the original question:
[Image: frequency plot using dots instead of bars]

The difference from that question is that I've already calculated my bins and frequency values, so I don't use numpy or matplotlib to compute them.

Here's my sample data; I refer to it as df_fd in the sample code below:

     low_bin   high_bin  frequency
0  13.142857  18.857143          3
1  18.857143  24.571429          5
2  24.571429  30.285714          8
3  30.285714  36.000000          8
4  36.000000  41.714286          7
5  41.714286  47.428571          7
6  47.428571  53.142857          1
7  53.142857  58.857143          1

Based on the cited question, here's my code (df_fd is the DataFrame above):

fig, ax = plt.subplots()
ax.bar(df_fd.low_bin, df_fd.frequency, width= df_fd.high_bin-df_fd.low_bin)
X,Y = np.meshgrid(bins, df_fd['frequency'])
Y = Y.astype(np.float)
Y[Y>df_fd['frequency']] = np.nan
plt.scatter(X,Y)

This Y[Y>df_fd['frequency']] = np.nan statement is what fails, and I don't know how to get around it. I understand what it's trying to do, and the best guess I have is somehow mapping the matrix index to the DataFrame index would help, but I'm not sure how to do that.

Thank you for helping me!
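
For reference, a minimal sketch of how the meshgrid-and-mask idea from the question might be made to run with precomputed bins. It rests on a few assumptions not stated in the question: that the failure comes from comparing a 2-D array against a pandas Series, that `bins` was meant to hold one x position per bin (bin midpoints are used here), and that the y levels should run from 1 up to the largest frequency. The names `centers`, `freq`, and `levels` are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data copied from the question.
df_fd = pd.DataFrame({
    "low_bin": [13.142857, 18.857143, 24.571429, 30.285714,
                36.000000, 41.714286, 47.428571, 53.142857],
    "high_bin": [18.857143, 24.571429, 30.285714, 36.000000,
                 41.714286, 47.428571, 53.142857, 58.857143],
    "frequency": [3, 5, 8, 8, 7, 7, 1, 1],
})

# Plain NumPy arrays sidestep pandas index alignment in the comparison below.
centers = ((df_fd["low_bin"] + df_fd["high_bin"]) / 2).to_numpy()  # assumed x positions
freq = df_fd["frequency"].to_numpy()

# One row per possible dot height (1..max frequency), one column per bin.
levels = np.arange(1, freq.max() + 1)
X, Y = np.meshgrid(centers, levels)

Y = Y.astype(float)    # builtin float; the np.float alias was removed in NumPy 1.24
Y[Y > freq] = np.nan   # freq broadcasts across rows, hiding dots above each bin's count

fig, ax = plt.subplots()
ax.bar(df_fd["low_bin"], df_fd["frequency"],
       width=df_fd["high_bin"] - df_fd["low_bin"],
       align="edge", alpha=0.3)  # align="edge" so each bar starts at low_bin
ax.scatter(X, Y)                 # NaN entries are not drawn
```

The mask works because `freq` has one entry per column of `Y`, so NumPy broadcasting compares every dot height in a column against that bin's frequency.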


Answer 1

Score: 2

One hacky solution using a scatter plot (df here is the question's df_fd):

import numpy as np

(df.assign(bin=np.mean([df['low_bin'], df['high_bin']], axis=0))  # bin midpoint as X
   .loc[lambda d: d.index.repeat(d['frequency'])]  # repeat each row "frequency" times
   .assign(Y=lambda d: d.groupby(level=0).cumcount())  # 0, 1, 2, ... within each bin
   .plot.scatter(x='bin', y='Y', s=600)
)

It works by taking the midpoint of low_bin and high_bin as the X value, repeating each row as many times as its "frequency" value, and numbering the repeated rows within each bin with groupby.cumcount to get the Y value. Because cumcount starts at 0, the dots in a bin sit at Y = 0 through frequency - 1.

Output:

[Image: frequency plot using dots instead of bars]
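
The steps described in this answer can be sketched as a self-contained script. The DataFrame is rebuilt from the question's sample data; the `+ 1` offset (so the top dot in each bin sits at that bin's frequency), the marker size, and the axis labels are additions for illustration and are not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Sample data copied from the question.
df_fd = pd.DataFrame({
    "low_bin": [13.142857, 18.857143, 24.571429, 30.285714,
                36.000000, 41.714286, 47.428571, 53.142857],
    "high_bin": [18.857143, 24.571429, 30.285714, 36.000000,
                 41.714286, 47.428571, 53.142857, 58.857143],
    "frequency": [3, 5, 8, 8, 7, 7, 1, 1],
})

dots = (
    df_fd.assign(bin=(df_fd["low_bin"] + df_fd["high_bin"]) / 2)  # bin midpoint as x
    .loc[lambda d: d.index.repeat(d["frequency"])]                # one row per dot
    # cumcount numbers rows 0, 1, 2, ... per bin; the + 1 is an assumption
    # added here so dot heights run from 1 up to the bin's frequency.
    .assign(Y=lambda d: d.groupby(level=0).cumcount() + 1)
)

ax = dots.plot.scatter(x="bin", y="Y", s=200)
ax.set_xlabel("bin midpoint")
ax.set_ylabel("frequency")
```

Keeping the intermediate `dots` table around makes it easy to check that the number of rows equals the total frequency before plotting.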

huangapple
  • Published on May 13, 2023, 20:24:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76242711.html