Plotly Express Timeline discrete color feature bug?

Question

I'm using plotly.express.timeline to plot timelines. Let me illustrate:

display(plot_df)
# Note: the date parts of start_time and end_time are not used

Data

fig1 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig1.show()

Timeline plot 1

So far so good. The only problem is that the color scale based on category is continuous, while category is actually a categorical variable. So I changed the type of the category column to string and tried again:

plot_df['category'] = plot_df['category'].astype(str)
fig2 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig2.show()

Timeline plot 2

Obviously something went wrong... But what?

PS: Depending on the input data, I get different kinds of strange results when converting the category column to string. Another example:

plot_df2 = plot_df[plot_df['start_date'] > pd.to_datetime('2019-05-24')]
fig3 = px.timeline(plot_df2, x_start='start_time', x_end='end_time', y='start_date', color='category')
fig3.show()

Timeline plot 3

I also tried visualizing the plots in a different environment, with similar results.

Answer 1

Score: 0

I can't explain exactly why this happens, but the first graph treats the category column as a number and uses the default continuous color scale, while the second graph treats the category column as a string, and its y-axis appears to be handled as time series data, like the x-axis. I concluded this from the scale of the y-axis and from the legend: clicking category 1 in the legend hides the blue bars, and clicking category 5 hides the red bars, so that only category 8 is shown.

In summary, to get what you want, change the category column to a string and then set the y-axis type to category. I have also added a setting that lets you specify any colors you wish.

import io

import pandas as pd
import plotly.express as px

# Sample data; per the question, only the time of day in start_time/end_time matters
data = '''
end_time start_date start_time category
"2023-03-08 10:00:00" 2019-05-24 "2023-03-08 09:00:00" 1
"2023-03-08 09:00:00" 2019-05-24 "2023-03-08 08:00:00" 1
"2023-03-08 13:00:00" 2019-05-24 "2023-03-08 10:00:00" 5
"2023-03-08 10:00:00" 2019-05-27 "2023-03-08 07:00:00" 1
"2023-03-08 15:00:00" 2019-05-27 "2023-03-08 10:00:00" 1
"2023-03-08 14:00:00" 2019-05-28 "2023-03-08 07:00:00" 8
"2023-03-08 15:00:00" 2019-05-28 "2023-03-08 14:00:00" 5
'''

# Whitespace-separated columns; sep=r'\s+' replaces the deprecated delim_whitespace=True
plot_df = pd.read_csv(io.StringIO(data), sep=r'\s+')

plot_df['end_time'] = pd.to_datetime(plot_df['end_time'])
plot_df['start_time'] = pd.to_datetime(plot_df['start_time'])

# Cast category to string so that the colors are discrete
plot_df['category'] = plot_df['category'].astype(str)

fig1 = px.timeline(plot_df, x_start='start_time', x_end='end_time', y='start_date', color='category',
                   color_discrete_sequence=["red", "green", "blue"])
# Treat the y-axis values as categories instead of dates
fig1.update_yaxes(type='category')
fig1.show()

Screenshot: first graph

Screenshot: second graph (only category 8 displayed)

huangapple
  • Published on March 8, 2023 at 17:32:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75671351.html