Plotly Express Timeline discrete color feature bug?
Question
I'm using plotly.express.timeline to plot... Timelines. Let me illustrate:
display(plot_df)
# Note: the date parts of start_time and end_time are not used, only the time of day
fig1 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig1.show()
So far so good. The only problem is that the category-based colormap is continuous, while in reality category is categorical. So I changed the category column's dtype to string and tried again:
plot_df['category'] = plot_df['category'].astype(str)
fig2 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig2.show()
Obviously something went wrong... But what?
PS: Depending on the input data, I get different kinds of strange results when converting the category column to string. Another example:
plot_df2 = plot_df[plot_df['start_date'] > pd.to_datetime('2019-05-24')]
fig3 = px.timeline(plot_df2, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig3.show()
I also tried visualizing the plots in a different environment, with similar results.
Answer 1
Score: 0
I can't explain exactly why this happens, but here is what I observed. In the first graph the category column is numeric, so Plotly Express applies its default continuous color scale. In the second graph the category column is a string, so the bars are drawn as separate groups per category, while the y-axis is still treated as a date axis, just like the x-axis. You can check this from the legend: clicking category 1 hides the blue bars, and clicking category 5 hides the red bars, so only category 8 remains visible. I came to this conclusion by looking at the scale of the y-axis. In summary, to get what you want, convert the category column to string and then set the y-axis type to category. I have also added a setting (color_discrete_sequence) that lets you specify any colors you like.
import io

import pandas as pd
import plotly.express as px

# Sample data: start_date is the calendar day; start_time/end_time carry the time of day
data = '''
end_time start_date start_time category
"2023-03-08 10:00:00" 2019-05-24 "2023-03-08 09:00:00" 1
"2023-03-08 09:00:00" 2019-05-24 "2023-03-08 08:00:00" 1
"2023-03-08 13:00:00" 2019-05-24 "2023-03-08 10:00:00" 5
"2023-03-08 10:00:00" 2019-05-27 "2023-03-08 07:00:00" 1
"2023-03-08 15:00:00" 2019-05-27 "2023-03-08 10:00:00" 1
"2023-03-08 14:00:00" 2019-05-28 "2023-03-08 07:00:00" 8
"2023-03-08 15:00:00" 2019-05-28 "2023-03-08 14:00:00" 5
'''
# Columns are whitespace-separated (delim_whitespace=True is deprecated in recent pandas)
plot_df = pd.read_csv(io.StringIO(data), sep=r'\s+')
plot_df['end_time'] = pd.to_datetime(plot_df['end_time'])
plot_df['start_time'] = pd.to_datetime(plot_df['start_time'])

# Cast category to string so Plotly Express assigns discrete colors
plot_df['category'] = plot_df['category'].astype(str)
fig1 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category',
                   color_discrete_sequence=["red", "green", "blue"])
# Treat the y-axis values as categories rather than as dates
fig1.update_yaxes(type='category')
fig1.show()
[Screenshot: second graph, with only category 8 displayed]