May 30, 2023, 02:29:49

Plot year by year in the same plot (plotly)

Question


I am trying to plot a time series that spans many years so that every year is drawn over the same start and end day-month range.

For example, I need the data from 01/01 onward for 2018, 2019, and so on in the same plot, so that I can compare the different years.

The code I have plots the data year by year in a subplot, but I would like a single Plotly plot.

Link to the data to download in order to run the script

from datetime import date, datetime

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

df_decremento_municipio = pd.read_csv('app/data/decremento_municipio_202305291512.csv', index_col="view_date")
template_graph = {
    "layout": {
        "modebar": {
            "remove": [
                "zoom",
                "pan",
                "select",
                "zoomIn",
                "zoomOut",
                "lasso2d",
                "autoscale",
            ]
        },
        "separators": ".",
        "showlegend": True,
    }
}

df_decremento_municipio.index = pd.to_datetime(df_decremento_municipio.index)
df_decremento_municipio["year"] = df_decremento_municipio.index.year

min_date = df_decremento_municipio.index.date.min()
max_date = df_decremento_municipio.index.date.max()

# Collect one cumulative series per year and concatenate them once
# (Series.append was removed in pandas 2.0).
acumulacoes = []
for ano in df_decremento_municipio.index.year.unique()[
    df_decremento_municipio.index.year.unique() > 2016
]:
    # Drop the years 2015 and 2016 (very poor data)

    df_por_ano = df_decremento_municipio[df_decremento_municipio["year"] == ano]
    dff_acumulacao = df_por_ano["area_ha"].groupby([df_por_ano.index]).sum().cumsum()
    acumulacoes.append(dff_acumulacao)
df = pd.concat(acumulacoes)

fig = go.Figure()

# First date with data in each year
start = ["2020-01-01", "2018-01-04", "2021-01-05", "2022-01-05", "2019-01-01", "2023-01-10", "2017-12-27"]
# Last date with data in each year
end = ["2020-12-31", "2018-12-27", "2021-12-16", "2022-12-31", "2019-12-22", "2023-05-03", "2017-01-26"]

years = df.index.year.unique()[df.index.year.unique() > 2016].sort_values()

for idx, (s, e) in enumerate(zip(start, end)):
    # Slice the cumulative series to the date range of this entry
    tmp = df[(df.index >= s) & (df.index <= e)]
    fig.add_trace(go.Scatter(x=tmp.index,
                             y=tmp,
                             name=str(years[idx]),
                             mode='lines',
                            ))

fig.update_layout(height=600, xaxis_tickformat='%d-%m')
fig.update_xaxes(type='date')

fig.show()

Answer 1

Score: 0


Because your x-axis values still contain the year, the lines in your plot do not overlap: each line starts where the previous one left off. Hiding the year with xaxis_tickformat='%d-%m' does not help in this case, because it only changes how the ticks are displayed, not the underlying values.

To compare the time series across years, remove the year from the x values themselves. In your add_trace call, change the x values to contain only the month and day using strftime('%m-%d'):

fig.add_trace(go.Scatter(x=tmp.index.strftime('%m-%d'),
                         y=tmp,  # tmp is already a Series in the question's code
                         name=str(years[idx]),
                         mode='lines',
                        ))

Note that you also need to remove the fig.update_xaxes(type='date') line, because the x values are now strings.
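The approach above can be sketched end to end on made-up data. This is only an illustration under assumptions: the helper names `month_day_traces` and `build_figure` and the demo series are hypothetical and stand in for the CSV from the question. Sorting the category axis with `categoryorder` is an addition here, intended to keep the 'MM-DD' labels in calendar order when the years do not share exactly the same dates.

```python
import pandas as pd


def month_day_traces(series):
    """Split a date-indexed Series into one (name, x, y) triple per year.

    x holds 'MM-DD' strings, so every year shares the same axis labels;
    y holds the cumulative sum within that year.
    """
    traces = []
    for year, chunk in series.groupby(series.index.year):
        x = chunk.index.strftime("%m-%d").tolist()
        traces.append((str(year), x, chunk.cumsum().tolist()))
    return traces


def build_figure(series):
    # Imported here so the data preparation above works without Plotly.
    import plotly.graph_objects as go

    fig = go.Figure()
    for name, x, y in month_day_traces(series):
        fig.add_trace(go.Scatter(x=x, y=y, name=name, mode="lines"))
    # The x values are strings, so use a category axis instead of a date axis
    # and sort the labels so that 'MM-DD' stays in calendar order.
    fig.update_xaxes(type="category", categoryorder="category ascending")
    return fig


# Made-up daily data covering two years (placeholder for the real CSV).
idx = pd.date_range("2021-01-01", "2022-12-31", freq="D")
demo = pd.Series(1.0, index=idx, name="area_ha")
traces = month_day_traces(demo)
```

Calling `build_figure(demo).show()` should then draw one line per year over a shared month-day axis.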

Answer 2

Score: 0


This is what I ended up doing, and I believe it is more sensible than the suggestion in the answer at https://stackoverflow.com/questions/69013744/a-simple-way-to-plot-day-and-month-only-on-the-x-axis-to-compare-years

I created a timedelta column (the time elapsed since 1 January of each year) and put it on the x-axis.

Plotly does not work well with columns of timedelta type, but I think this is the best I can do.

import pandas as pd
import plotly.express as px

df_decremento_municipio = pd.read_csv('app/graficos_dev/data/decremento_municipio_202305291512.csv', index_col="view_date")

template_graph = {
    "layout": {
        "modebar": {
            "remove": [
                "zoom",
                "pan",
                "select",
                "zoomIn",
                "zoomOut",
                "lasso2d",
                "autoscale",
            ]
        },
        "separators": ".",
        "showlegend": True,
    }
}

df_decremento_municipio.index = pd.to_datetime(df_decremento_municipio.index)

# Total area per date
df_decremento_municipio = df_decremento_municipio[["area_ha"]].groupby(df_decremento_municipio.index).sum()

t1 = pd.DataFrame()

for ano in df_decremento_municipio.index.year.unique()[
    df_decremento_municipio.index.year.unique() > 2017
]:

    # Work on a copy so the column assignments below do not touch a slice
    df = df_decremento_municipio[df_decremento_municipio.index.year == ano].copy()
    df["year"] = df.index.year
    # Time elapsed since 1 January of the same year; the division by 1000000
    # is the workaround used here to show the timedelta on a date axis
    df["timedelta"] = (df.index - (df.index.year.astype("str") + "-01-01").astype("datetime64[ns]"))/1000000
    df["cumsum"] = df["area_ha"].cumsum()

    t1 = pd.concat([t1, df[["year", "timedelta", "cumsum"]]])

fig = px.line(t1, x="timedelta", y="cumsum", color='year')

fig.update_layout(title="Desflorestamento por Tempo",
    xaxis={"title": "Data"},
    yaxis={"title": "Área (ha)"},
    xaxis_tickformat='%d-%m'
)

fig.update_xaxes(type='date')

fig.show()

[Figure: the resulting plot]
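A variation on the same elapsed-time idea, offered only as a sketch and not as this answer's exact method: express the elapsed time as a plain number of days since 1 January, so the x-axis is numeric and no timedelta column reaches Plotly. The helper names `days_since_new_year`, `yearly_cumulative`, and `build_figure`, and the demo values, are hypothetical; only the `area_ha` column name comes from the question.

```python
import pandas as pd


def days_since_new_year(index):
    """Elapsed days between each timestamp and 1 January of its own year."""
    year_start = pd.to_datetime(
        pd.DataFrame({"year": index.year, "month": 1, "day": 1})
    ).to_numpy()
    return (index - year_start) / pd.Timedelta(days=1)


def yearly_cumulative(frame, value_col="area_ha"):
    """Sum values per date, then accumulate them separately within each year."""
    daily = frame[value_col].groupby(level=0).sum()
    out = daily.to_frame("value")
    out["year"] = out.index.year
    out["days"] = days_since_new_year(out.index).to_numpy()
    out["cumsum"] = out.groupby("year")["value"].cumsum()
    return out.reset_index(drop=True)


def build_figure(table):
    # Imported here so the data preparation above works without Plotly.
    import plotly.express as px

    fig = px.line(table, x="days", y="cumsum", color="year")
    fig.update_layout(xaxis={"title": "Days since 1 January"})
    return fig


# Made-up values (placeholder for the real CSV); the first date appears twice.
idx = pd.to_datetime(
    ["2021-01-01", "2021-01-01", "2021-03-01", "2022-01-01", "2022-03-01"]
)
demo = pd.DataFrame({"area_ha": [1.0, 2.0, 3.0, 10.0, 5.0]}, index=idx)
table = yearly_cumulative(demo)
```

The trade-off in this sketch is that the axis shows day counts rather than calendar dates, so the same day number falls on a calendar date one day earlier in a leap year once 29 February has passed.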
