How to apply pandas dataframe.where on just one column

Question

I'm trying to add some days to a dataframe column using dataframe.where, but it behaves unexpectedly. Here are the dataframe, the current result, and the expected result.

import pandas as pd

my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30],
         'Result Date': ['2021-08-01', '2021-04-24', '2021-09-30', '2021-07-13', '2021-09-30', '2021-07-13'],
}
my_data = pd.DataFrame(data=my_dict)
my_data['Result Date'] = pd.to_datetime(my_data['Result Date'])

my_data['Result Date'] = my_data.where(my_data["ID"] > 5, my_data['Result Date'] + pd.Timedelta(days=14), axis=1)

Current result: [image not available]

Expected result: [image not available]

Answer 1

Score: 1

I'm assuming "Arrival time" was a typo, since nothing named Arrival time appears in my_dict.

Here is the updated code:

import pandas as pd
import numpy as np

my_dict={'NAME':['Ravi','Raju','Alex','Ron','King','Jack'],
         'ID':[1,2,3,4,5,6],
         'MATH':[80,40,70,70,82,30],
         'ENGLISH':[81,70,40,50,60,30],
         'Result Date': ['2021-08-01', '2021-04-24', '2021-09-30', '2021-07-13', '2021-09-30', '2021-07-13'],
}
my_data = pd.DataFrame(data=my_dict)
my_data['Result Date'] = pd.to_datetime(my_data['Result Date'])

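# Keep the shifted date where ID > 5; every other row becomes None
my_data['Result Date'] = np.where(my_data["ID"] > 5, my_data['Result Date'] + pd.Timedelta(days=14), None)
# Convert the column back to datetime; None becomes NaT
my_data['Result Date'] = pd.to_datetime(my_data['Result Date'])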
print(my_data)

Output:

   NAME  ID  MATH  ENGLISH Result Date
0  Ravi   1    80       81         NaT
1  Raju   2    40       70         NaT
2  Alex   3    70       40         NaT
3   Ron   4    70       50         NaT
4  King   5    82       60         NaT
5  Jack   6    30       30  2021-07-27

Answer 2

Score: 0

The code below gives the desired result. The answer by Yashaswi K works, but it requires another library, NumPy, and I was looking for a pandas-only solution.

my_data['Result Date'] = (my_data['Result Date'] + pd.Timedelta(days=14)).where(my_data["ID"] > 5, None)
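
For anyone who wants to run this on its own, here is a minimal self-contained sketch of the same idea. It assumes a cut-down frame with only the ID and Result Date columns from the question, and it leaves out the second argument to where, relying on the default fill value (a missing value, which shows up as NaT in a datetime column). The name shifted is just an illustrative helper.

```python
import pandas as pd

# Cut-down frame with the same ID and Result Date values as in the question
my_data = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Result Date": pd.to_datetime(
        ["2021-08-01", "2021-04-24", "2021-09-30",
         "2021-07-13", "2021-09-30", "2021-07-13"]
    ),
})

# Shift every date by 14 days, then keep only the rows where ID > 5.
# Series.where replaces the rows where the condition is False
# with a missing value (NaT for a datetime column).
shifted = my_data["Result Date"] + pd.Timedelta(days=14)
my_data["Result Date"] = shifted.where(my_data["ID"] > 5)

print(my_data)
```

Calling where on the single column, rather than on the whole dataframe, means the result is a Series that can be assigned straight back to that column.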

Answer 3

Score: 0

Use where on the specific column:

my_data['Result Date'] = my_data['Result Date'].where(my_data["ID"].gt(5)) + pd.Timedelta(days=14)

>>> my_data
   NAME  ID  MATH  ENGLISH Result Date
0  Ravi   1    80       81         NaT
1  Raju   2    40       70         NaT
2  Alex   3    70       40         NaT
3   Ron   4    70       50         NaT
4  King   5    82       60         NaT
5  Jack   6    30       30  2021-07-27
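
The two pandas-only answers differ only in whether the 14 days are added before or after the call to where. As a small sketch on a two-row sample taken from the question's data (the names dates, ids, delta, add_then_mask and mask_then_add are illustrative, not from the original posts), the order should not matter, because adding a Timedelta to NaT still gives NaT:

```python
import pandas as pd

# Two rows from the question: ID 5 should end up NaT, ID 6 should be shifted
dates = pd.Series(pd.to_datetime(["2021-09-30", "2021-07-13"]))
ids = pd.Series([5, 6])
delta = pd.Timedelta(days=14)

# Add the offset first, then blank out rows that fail the condition
add_then_mask = (dates + delta).where(ids > 5)

# Blank out rows first, then add the offset (NaT + Timedelta stays NaT)
mask_then_add = dates.where(ids > 5) + delta

print(add_then_mask)
print(mask_then_add)
```

Either ordering keeps the operation on the one column, so the choice is mostly a matter of readability.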

huangapple
  • Posted on July 28, 2023 at 00:48:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76781892.html