Bar plot based on two columns

Question

I have generated the dataframe below. I want to draw a bar plot where the x-axis has two categories (the exp_type values) and the y-axis shows the avg value, with a legend entry for each disk_type.

      exp_type disk_type          avg
0  Random Read      nvme  3120.240000
1  Random Read       sda   132.638831
2  Random Read       sdb   174.313413
3     Seq Read      nvme  3137.849000
4     Seq Read       sda   119.171269
5     Seq Read       sdb   211.451616

I tried the code below, but I get the wrong plot. The bars should be grouped together.

def plot(df):
    df.plot(x='exp_type', y=['avg'], kind='bar')
    print(df)


Answer 1

Score: 1

The important thing here is to reshape your dataframe correctly with `pivot`:

    (df.pivot(index='disk_type', columns='exp_type', values='avg').rename_axis(columns='Exp Type')
       .plot(kind='bar', rot=0, title='Performance', xlabel='Disk Type', ylabel='IOPS'))

OR

    (df.pivot(index='exp_type', columns='disk_type', values='avg').rename_axis(columns='Disk Type')
       .plot(kind='bar', rot=0, title='Performance', xlabel='Exp Type', ylabel='IOPS'))

Output:

[![Output plot 1][1]][1]

[![Output plot 2][2]][2]

**Update**

Pandas doesn't know how to group the data because you have a flat dataframe (one numeric value per row). You have to reshape it so that each row becomes a bar group and each column becomes one bar within every group:

    >>> df.pivot(index='exp_type', columns='disk_type', values='avg')
    disk_type        nvme         sda         sdb   # <- One bar per column in each group
    exp_type
    Random Read  3120.240  132.638831  174.313413   # <- First bar group
    Seq Read     3137.849  119.171269  211.451616   # <- Second bar group

  [1]: https://i.stack.imgur.com/4nJpI.png
  [2]: https://i.stack.imgur.com/zEHuU.png
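For anyone who wants to reproduce the reshape end to end, here is a minimal sketch. It rebuilds the question's dataframe by hand and wraps the two steps in helpers; the names `to_wide` and `plot_grouped_bars` are hypothetical and not part of pandas or of the original answer. The plotting helper assumes matplotlib is installed and is only defined here, not called.

```python
import pandas as pd

# The question's data in long ("flat") form: one measurement per row.
df = pd.DataFrame({
    "exp_type": ["Random Read"] * 3 + ["Seq Read"] * 3,
    "disk_type": ["nvme", "sda", "sdb"] * 2,
    "avg": [3120.240000, 132.638831, 174.313413,
            3137.849000, 119.171269, 211.451616],
})


def to_wide(frame):
    """Hypothetical helper: one row per bar group, one column per bar."""
    return frame.pivot(index="exp_type", columns="disk_type", values="avg")


def plot_grouped_bars(wide):
    """Hypothetical helper: draw the wide table as grouped bars (needs matplotlib)."""
    return wide.rename_axis(columns="Disk Type").plot(
        kind="bar", rot=0, title="Performance",
        xlabel="Exp Type", ylabel="IOPS",
    )


wide = to_wide(df)
print(wide)
```

Calling `plot_grouped_bars(wide)` and then `matplotlib.pyplot.show()` should draw two groups (one per exp_type) with three bars each. Note that `pivot` raises a `ValueError` if the same (exp_type, disk_type) pair appears more than once; in that case `pivot_table` with an aggregation function is the usual alternative.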



huangapple
  • Posted on February 18, 2023, 01:58:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75487762.html