Chart with horizontal bar subcharts for each index bin in dataframe
Question
I have a DataFrame with a MultiIndex (sorted by the "5min_intervals" and "price" levels):
                            quantity
5min_intervals      price
2023-07-27 17:40:00 172.20       330
                    172.19         1
2023-07-27 17:45:00 172.25         4
                    172.24        59
                    172.23       101
                    172.22       224
                    172.21        64
                    172.20       303
                    172.19       740
                    172.18        26
2023-07-27 17:50:00 172.17        30
                    172.16         2
                    172.15      1014
                    172.14       781
                    172.13      1285
I know that df.plot.barh() produces a simple horizontal bar chart. I can also iterate over the DataFrame by the '5min_intervals' level with
for date in df.index.levels[0]:
    print(df.loc[date])
and get a DataFrame for each '5min_intervals' value, like the one below:
        quantity
price
172.20       330
172.19         1
Is there a way to create a single chart with matplotlib in which each '5min_intervals' value gets its own horizontal bar chart? Schematically, it should look like the picture below.
Answer 1
Score: 1
Let's prepare some dummy data to work with:
from pandas import DataFrame, Timestamp
data = {
    'quantity': {
        (Timestamp('2023-07-27 17:40:00'), 172.2): 330,
        (Timestamp('2023-07-27 17:40:00'), 172.19): 1,
        (Timestamp('2023-07-27 17:45:00'), 172.25): 4,
        (Timestamp('2023-07-27 17:45:00'), 172.24): 59,
        (Timestamp('2023-07-27 17:45:00'), 172.23): 101,
        (Timestamp('2023-07-27 17:45:00'), 172.22): 224,
        (Timestamp('2023-07-27 17:45:00'), 172.21): 64,
        (Timestamp('2023-07-27 17:45:00'), 172.2): 303,
        (Timestamp('2023-07-27 17:45:00'), 172.19): 740,
        (Timestamp('2023-07-27 17:45:00'), 172.18): 26,
        (Timestamp('2023-07-27 17:50:00'), 172.17): 30,
        (Timestamp('2023-07-27 17:50:00'), 172.16): 2,
        (Timestamp('2023-07-27 17:50:00'), 172.15): 1014,
        (Timestamp('2023-07-27 17:50:00'), 172.14): 781,
        (Timestamp('2023-07-27 17:50:00'), 172.13): 1285
    }
}
df = DataFrame(data)
As for the graphics, there is hardly anything like what you asked for available out of the box, so we need to build it manually. First, let's do some basic preparation:
unique_time = df.index.get_level_values(0).unique().sort_values()
unique_price = df.index.get_level_values(1).unique().sort_values()
spacing = 0.2   # minimum horizontal gap left between the longest bar and the next time column
values = (1-spacing) * df/df.max()   # bar lengths relative to the global maximum (at most 1-spacing)
base = DataFrame(index=unique_price)   # empty frame indexed by every price, used to align each interval
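To make the scaling step concrete, here is a minimal sketch on a tiny made-up frame. The name `demo` and its three rows are assumptions chosen for illustration only; the point is that dividing by the global maximum and multiplying by `1 - spacing` keeps every bar shorter than one unit, so a bar starting at time column `i` never reaches column `i + 1`.

```python
import pandas as pd

# Hypothetical three-row frame, used only to illustrate the scaling step.
idx = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2023-07-27 17:40:00"), 172.20),
        (pd.Timestamp("2023-07-27 17:40:00"), 172.19),
        (pd.Timestamp("2023-07-27 17:45:00"), 172.19),
    ],
    names=["5min_intervals", "price"],
)
demo = pd.DataFrame({"quantity": [330, 1, 740]}, index=idx)

spacing = 0.2
# Divide by the global maximum so the longest bar has length 1 - spacing.
scaled = (1 - spacing) * demo / demo.max()

# Every scaled length lies between 0 and 1 - spacing.
print(scaled["quantity"].round(4).tolist())
```

Because the scaling uses one global maximum, bar lengths stay comparable across intervals; scaling each interval by its own maximum would be a different design choice and would hide differences in volume between intervals.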
To build the graphics, we can use barh(y, width, height, left), the horizontal bar function, with prices as the first parameter, the scaled values as the width, some small fixed height, and the position of the time interval as the left offset. To keep very small values visible, we can additionally mark the beginning (left end) of each bar with a small tick, again using barh.
import matplotlib.pyplot as plt
from matplotlib.colors import TABLEAU_COLORS
from itertools import cycle
fig, ax = plt.subplots(figsize=(7,7))
xlabels = unique_time.astype(str)
ylabels = unique_price.astype(str)
# one x position per time interval, one y position per price level
ax.set_xticks(range(len(xlabels)), xlabels, rotation=45)
ax.set_yticks(range(len(ylabels)), ylabels)
ax.set_xlim([-0.5,len(xlabels)])
ax.set_ylim([-1, len(ylabels)])
for i, (t, c) in enumerate(zip(unique_time, cycle(TABLEAU_COLORS))):
    # align this interval's values to the full price list (missing prices become NaN)
    pr = base.join(values.loc[t]).squeeze()
    # draw the horizontal bars of the i-th interval, starting at x = i
    ax.barh(ylabels, pr, 0.1, i, color=c)
    # put a small tick at the left end of each bar so tiny values stay visible
    ax.barh(ylabels, 0.01*pr.notna(), 0.3, i, color=c)
ax.grid(axis='y', linestyle='--', linewidth=0.5)
fig.tight_layout()
plt.show()
python : 3.11
pandas : 1.5.1
matplotlib : 3.6.1
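As a possible alternative to placing all intervals on one axis, the same picture can be approximated with one subplot per interval sharing the price axis. The sketch below is not the method used above; `plot_interval_panels` is a hypothetical helper name, and it assumes a two-level MultiIndex (time, price) with no duplicated (time, price) pairs.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_interval_panels(df, column="quantity"):
    """Draw one horizontal-bar panel per first-level index value.

    Hypothetical helper, not part of pandas or matplotlib. Assumes a
    two-level MultiIndex (time, price) without duplicated pairs.
    """
    times = df.index.get_level_values(0).unique().sort_values()
    prices = df.index.get_level_values(1).unique().sort_values()

    # One column of axes per interval; shared axes keep bars comparable.
    fig, axes = plt.subplots(
        1, len(times), sharex=True, sharey=True, squeeze=False,
        figsize=(3 * len(times), 5),
    )
    for ax, t in zip(axes[0], times):
        # Align this interval's rows to the full price list; gaps become 0.
        quantities = df.xs(t, level=0)[column].reindex(prices, fill_value=0)
        ax.barh(range(len(prices)), quantities.to_numpy(), height=0.6)
        ax.set_title(str(t), fontsize=8)
        ax.set_yticks(range(len(prices)))
        ax.set_yticklabels([f"{p:.2f}" for p in prices])
    fig.tight_layout()
    return fig, axes[0]
```

Calling `fig, axes = plot_interval_panels(df)` followed by `plt.show()` should display one panel per interval. Compared with the single-axis version, this trades the compact layout for real quantity values on the x-axis, since no rescaling is needed.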