Matplotlib combining datapoints

Question

I have this dataframe:
[image: the dataframe, not preserved]

I would like to plot the movies in order, but when I tried, it looked like this:
[image: the resulting plot, not preserved]

How do I plot this dataframe without combining movies that are listed twice (i.e. "Top Gun: Maverick" and "Avatar: The Way of Water")?
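The plotting code from the question is not shown, so the following is only a guess at the cause: a minimal sketch, assuming the `Film` column was passed directly as `y` to `ax.barh`. Matplotlib treats strings as categories and gives each distinct string a single position, so repeated titles end up drawn on top of each other. The `demo` frame is made-up stand-in data, not the asker's dataframe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line to view the figure
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; the original dataframe is not shown.
demo = pd.DataFrame({'Film': ['Top Gun: Maverick', 'The Batman', 'Top Gun: Maverick'],
                     'Weekend_gross': [1, 2, 3]})

fig, ax = plt.subplots()

# Strings passed as y are handled as categories:
# each distinct string maps to one position on the axis.
bars = ax.barh(demo['Film'], demo['Weekend_gross'])

# Collect the distinct vertical centers of the drawn bars.
centers = sorted({round(b.get_y() + b.get_height() / 2, 6) for b in bars})
print(len(bars), "bars drawn at", len(centers), "distinct y positions")
```

Both "Top Gun: Maverick" rows share one slot here, which would explain the merged look in the plot.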

Answer 1

Score: 1

You can try with barh from Pandas:

import pandas as pd
import matplotlib.pyplot as plt

data2 = pd.DataFrame({'Film': ['Top Gun: Maverick', 'The Batman', 'Top Gun: Maverick'],
                      'Weekend_gross': [1, 2, 3]})

# sort so the bars appear in order of gross
data2 = data2.sort_values('Weekend_gross')

clrs = ['r', 'orange', 'chartreuse']

fig, ax = plt.subplots(figsize=(10, 7))
data2.plot.barh('Film', 'Weekend_gross', color=clrs, legend=False, title='What movie Grossed the Most in 1 Weekend', ax=ax)
plt.tight_layout()
plt.show()

Output:

[image: the resulting bar chart, not preserved]
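To see why the pandas route keeps duplicate titles apart, it can help to inspect the axes afterwards. The sketch below assumes (based on how pandas bar plots behave in the versions I have used) that pandas places bars at integer positions and only attaches the `Film` values as tick labels. The `centers` and `labels` names are just for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line to view the figure
import matplotlib.pyplot as plt
import pandas as pd

data2 = pd.DataFrame({'Film': ['Top Gun: Maverick', 'The Batman', 'Top Gun: Maverick'],
                      'Weekend_gross': [1, 2, 3]})
data2 = data2.sort_values('Weekend_gross')

fig, ax = plt.subplots()
data2.plot.barh('Film', 'Weekend_gross', legend=False, ax=ax)

# Vertical center of every bar that was drawn.
centers = [round(p.get_y() + p.get_height() / 2, 6) for p in ax.patches]
# Text shown next to each tick.
labels = [t.get_text() for t in ax.get_yticklabels()]

print(centers)
print(labels)
```

Each row gets its own position, and the repeated title simply appears twice among the tick labels.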

Answer 2

Score: 0

Instead of passing data2['Film'] as y to ax.barh, you can pass a range of consecutive numbers and then use ax.set_yticks to set the values of data2['Film'] as the tick labels. E.g.:

import pandas as pd
import matplotlib.pyplot as plt

data2 = pd.DataFrame({'Film': ['Film A', 'Film B', 'Film A', 'Film D'],
                      'Weekend_gross': [1, 3, 7, 5]})

data2 = data2.sort_values('Weekend_gross')

clrs = ['teal', 'orange', 'cyan', 'violet']

fig, ax = plt.subplots(figsize=(10, 7))

# create a range with length of your data
idx = range(len(data2))

# pass `idx` to `y` param
ax.barh(idx, data2['Weekend_gross'], 
        color=clrs, edgecolor='black')
ax.set(title='What movie Grossed the Most in 1 Weekend', 
       ylabel='Film')

# set `yticks` with `Film` column
ax.set_yticks(idx, data2['Film'])
plt.show()

Result:

[image: the resulting bar chart, not preserved]

Of course, if you don't mind resetting your index, you can just use data2.index for this. E.g.:

# reset index:
data2 = data2.sort_values('Weekend_gross').reset_index(drop=True)

...
# pass as `y`
ax.barh(data2.index, data2['Weekend_gross'], 
        color=clrs, edgecolor='black')

...
# and set yticks
ax.set_yticks(data2.index, data2['Film'])
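
One caveat, offered as a hedged note rather than part of the answer: as far as I know, passing labels as the second argument to `ax.set_yticks` requires Matplotlib 3.5 or newer. On older versions, a sketch of the same idea is to set the positions and the labels in two separate calls. The `tick_text` name is only there to inspect the result.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line to view the figure
import matplotlib.pyplot as plt
import pandas as pd

data2 = pd.DataFrame({'Film': ['Film A', 'Film B', 'Film A', 'Film D'],
                      'Weekend_gross': [1, 3, 7, 5]})
data2 = data2.sort_values('Weekend_gross')

fig, ax = plt.subplots(figsize=(10, 7))

# one numeric position per row, so repeated titles stay separate
idx = list(range(len(data2)))
ax.barh(idx, data2['Weekend_gross'], edgecolor='black')

# two-step form: positions first, then labels
ax.set_yticks(idx)
ax.set_yticklabels(data2['Film'].tolist())

tick_text = [t.get_text() for t in ax.get_yticklabels()]
print(tick_text)
```

The result should match the one-call form; the only difference is which Matplotlib versions accept it.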

Posted by huangapple on 2023-05-11 05:22:15
Original link: https://go.coder-hub.com/76222637.html

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定