Troubleshooting pd.to_timedelta calculation failure

Question


I have previously used the following conditional statement successfully:

for RxDat in df2:
    condition = (df['Tdate'] > RxDat - pd.to_timedelta(46, unit="D")) & (df['Tdate'] < RxDat)

Now I am getting the following error:

> TypeError: unsupported operand type(s) for -: 'str' and 'Timedelta'

I have extracted the following data to illustrate the error.

df['Tdate'] contains:

[Timestamp('2004-08-25 00:00:00'), Timestamp('2004-10-13 00:00:00'), Timestamp('2004-12-13 00:00:00'), Timestamp('2005-02-21 00:00:00'), Timestamp('2005-04-28 00:00:00'), Timestamp('2005-08-24 00:00:00')]

df2['RxDate'] contains:

[Timestamp('2004-08-20 00:00:00'), Timestamp('2004-08-23 00:00:00'), Timestamp('2004-08-18 00:00:00'), Timestamp('2004-08-15 00:00:00'), Timestamp('2004-08-12 00:00:00'), Timestamp('2004-08-13 00:00:00')]

I have looked at this a few ways and cannot see why I get the error.

Answer 1

Score: 3


Iterating over a DataFrame yields its column names, so in this loop RxDat is a string (the column label), not a Timestamp:

for RxDat in df2:

Iterate over the column instead:

for RxDat in df2['RxDate']:
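To see the difference between the two loops, here is a minimal sketch built from the dates in the question. The single-column layout of df and df2 is an assumption, since the question only shows the column contents, and the names labels, window and conditions are introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question; the single-column frame layout is assumed.
df = pd.DataFrame({"Tdate": pd.to_datetime([
    "2004-08-25", "2004-10-13", "2004-12-13",
    "2005-02-21", "2005-04-28", "2005-08-24",
])})
df2 = pd.DataFrame({"RxDate": pd.to_datetime([
    "2004-08-20", "2004-08-23", "2004-08-18",
    "2004-08-15", "2004-08-12", "2004-08-13",
])})

# Iterating over the DataFrame itself yields column labels (plain strings).
labels = list(df2)
print(labels)

# Iterating over the column yields Timestamp values, so the subtraction is valid.
window = pd.to_timedelta(46, unit="D")
conditions = []  # hypothetical container: one boolean Series per RxDate value
for rx_date in df2["RxDate"]:
    condition = (df["Tdate"] > rx_date - window) & (df["Tdate"] < rx_date)
    conditions.append(condition)
```

In the first case each loop value is the label 'RxDate', which is the 'str' operand named in the error message.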

A non-loop solution uses broadcasting. The output is a 2D NumPy array with one row per df['Tdate'] value and one column per df2['RxDate'] value:

a = df['Tdate'].to_numpy()[:, None]
b = df2['RxDate'].sub(pd.to_timedelta(46, unit="D")).to_numpy()
c = df2['RxDate'].to_numpy()

condition = (a > b) & (a < c)
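As a runnable sketch of the broadcasting approach, the snippet below rebuilds the two columns from the question's sample dates (the frame layout is again an assumption) and checks the shape of the result. The final reduction into in_any_window is an illustrative addition, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; the single-column frame layout is assumed.
df = pd.DataFrame({"Tdate": pd.to_datetime([
    "2004-08-25", "2004-10-13", "2004-12-13",
    "2005-02-21", "2005-04-28", "2005-08-24",
])})
df2 = pd.DataFrame({"RxDate": pd.to_datetime([
    "2004-08-20", "2004-08-23", "2004-08-18",
    "2004-08-15", "2004-08-12", "2004-08-13",
])})

a = df["Tdate"].to_numpy()[:, None]                               # shape (6, 1)
b = df2["RxDate"].sub(pd.to_timedelta(46, unit="D")).to_numpy()  # window start, shape (6,)
c = df2["RxDate"].to_numpy()                                      # window end, shape (6,)

# Broadcasting (6, 1) against (6,) compares every Tdate with every window.
condition = (a > b) & (a < c)
print(condition.shape)  # (6, 6)

# Illustrative reduction: True where a Tdate falls inside at least one window.
in_any_window = condition.any(axis=1)
```

Element [i, j] of condition answers whether the i-th Tdate lies strictly between the j-th RxDate minus 46 days and the j-th RxDate, which is the same test the loop performs one RxDate at a time.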

huangapple
  • Posted on March 9, 2023, 14:15:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75681010.html