2023-02-16 02:52:00

Date Countdown with pandas

Question

I'm trying to calculate the difference, in months, between a date and today. Here is what I have so far:

import pandas as pd
import numpy as np
from datetime import date

def calc_date_countdown(df):
    today = date.today()
    df['countdown'] = df['date'].apply(lambda x: (x - today) / np.timedelta64(1, 'M'))
    df['countdown'] = df['countdown'].astype(int)
    return df

Any pointers on what I'm doing wrong, or maybe a more efficient way of doing it?

When I run it on my dataset, this is the error I get: TypeError: unsupported operand type(s) for -: 'Timestamp' and 'datetime.date'

The error arises because the code subtracts a datetime.date from a pandas Timestamp, which is not a supported operation. Converting today's date to a Timestamp before subtracting fixes it:

import pandas as pd
import numpy as np
from datetime import date

def calc_date_countdown(df):
    # Convert today's date to a Timestamp so it can be subtracted from Timestamp values
    today = pd.to_datetime(date.today())
    # The division yields a scalar per row, so int() is used instead of the Series method .astype(int)
    # Note: newer pandas releases may reject the ambiguous 'M' unit used here
    df['countdown'] = df['date'].apply(lambda x: int((pd.to_datetime(x) - today) / np.timedelta64(1, 'M')))
    return df

This resolves the TypeError and gives the difference in months between each date and today.
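The type mismatch behind this TypeError can be seen in isolation. The sketch below is illustrative only: both dates are fixed, made-up values (one standing in for a column element, the other for date.today()) so that the result is reproducible.

```python
from datetime import date

import pandas as pd

# Stand-in for one element of a datetime64 column (hypothetical value)
ts = pd.Timestamp("2023-07-01")
# Fixed stand-in for date.today(), chosen only for reproducibility
today = date(2023, 2, 16)

try:
    ts - today  # Timestamp minus datetime.date: the operation reported in the error
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting the date to a Timestamp first makes the subtraction well-defined
delta = ts - pd.Timestamp(today)

print(error_message)
print(delta.days)  # 135
```

Once both operands are Timestamps, the result is a Timedelta, and its .days attribute can be turned into whatever month approximation is needed.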

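Answer 1

Score: 2

import pandas as pd

def calc_date_countdown(df):
    # Use a Timestamp for "now" so it subtracts cleanly from Timestamp values
    today = pd.Timestamp.today()
    # Approximate a month as 30 days; floor division rounds toward negative infinity
    df['countdown'] = df['date'].apply(lambda x: (x - today).days // 30)
    return df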

This should work as long as your date column in the dataframe is a Timestamp object. If it's not, you may need to convert it using pd.to_datetime() before running the function.
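A minimal sketch of that conversion step, assuming the dates arrive as ISO-formatted strings. The sample values and the fixed reference date are hypothetical, chosen so the output is reproducible; real code would use pd.Timestamp.today(). The countdown is also computed on the whole column through the .dt accessor rather than with apply, which is a substitution for the row-wise version above.

```python
import pandas as pd

# Hypothetical sample data: dates stored as strings rather than Timestamps
df = pd.DataFrame({"date": ["2023-03-01", "2023-06-15", "2024-01-01"]})

# Parse the strings into datetime64 values
df["date"] = pd.to_datetime(df["date"])

# Fixed reference date instead of pd.Timestamp.today(), for reproducibility
today = pd.Timestamp("2023-02-16")

# Same 30-day approximation as the answer, applied to the whole column at once
df["countdown"] = (df["date"] - today).dt.days // 30

print(df["countdown"].tolist())  # [0, 3, 10]
```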


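Answer 2

Score: 1

Using apply is not very efficient here, as this is an array operation. See the example below:

import numpy as np
import pandas as pd
from datetime import date

def per_array(df):
    # Operate on the whole column at once; date - today, matching using_apply below
    df['months'] = ((df['date'] - pd.to_datetime(date.today())) / np.timedelta64(1, 'M')).astype(int)
    return df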

def using_apply(df):
    today = date.today()
    df['months'] = df['date'].apply(lambda x: (x - pd.to_datetime(today)) / np.timedelta64(1, 'M'))
    df['months'] = df['months'].astype(int)
    return df

df = pd.DataFrame({'date': [pd.to_datetime(f"2023-0{i}-01") for i in range(1, 8)]})
print(df)
#         date
# 0 2023-01-01
# 1 2023-02-01
# 2 2023-03-01
# 3 2023-04-01
# 4 2023-05-01
# 5 2023-06-01
# 6 2023-07-01

Timing it:

%%timeit 
per_array(df)
195 µs ± 5.14 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit 
using_apply(df)
384 µs ± 3.22 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

As you can see, avoiding apply is roughly twice as fast.
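Both functions above divide by np.timedelta64(1, 'M'), which treats a month as a fixed average length and may not be accepted by every pandas version. As an alternative, here is a hedged sketch of a vectorized calendar-month count that avoids that unit entirely. The helper name months_until and the fixed reference date are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd


def months_until(dates: pd.Series, today: pd.Timestamp) -> pd.Series:
    """Whole calendar months from `today` to each date, rounded down."""
    # Count month boundaries between the two dates
    months = (dates.dt.year - today.year) * 12 + (dates.dt.month - today.month)
    # Subtract one where the day-of-month has not yet been reached (floor semantics)
    return months - (dates.dt.day < today.day).astype(int)


# Same sample dates as the answer: the first day of January through July 2023
df = pd.DataFrame({"date": pd.date_range("2023-01-01", periods=7, freq="MS")})

# Fixed reference date (hypothetical) so the result is reproducible
today = pd.Timestamp("2023-02-16")

df["months"] = months_until(df["date"], today)
print(df["months"].tolist())  # [-2, -1, 0, 1, 2, 3, 4]
```

Like per_array, this works on the whole column at once, so it keeps the performance benefit of not using apply.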

