Updating values in another column based on certain date conditions

Question


I have a DataFrame with a datetime column DOB and another column Age. The DOB column contains junk values, i.e. placeholder dates in the year 1900 such as 01-01-1900, and I want to set Age to 0 for the rows that hold those dates.

Column dtypes:

DOB: datetime64[ns]
Age: object

I tried the code below, but the result is not what I expect and I am not sure what is going wrong:

df['Age'].mask(df["DOB"]>="01-01-1900" ,'0', inplace=True)

Another attempt:

df['Age'] = np.where(df["DOB"]>="01-01-1900", 0,
df['DOB'])

With the code above, the DOB dates are also being updated incorrectly, and Age is set to 0 for the wrong dates.

Age should be set to 0 only for rows whose DOB falls in 1900.

Original dataframe:

DOB          AGE
2004-05-17   18
1900-01-01   141
1900-02-01   135

Expected output:

DOB          AGE
2004-05-17   18
1900-01-01   0
1900-02-01   0

Answer 1

Score: 2

To compare a datetime column against a date string, use the YYYY-MM-DD format, the default datetime format in Python:

import numpy as np

# compare only the year (less than or equal to 1900)
# the next three statements are alternatives that give the same result here; use any one
df['AGE'] = np.where(df["DOB"].dt.year <= 1900, 0, df['AGE'])

df['AGE'] = np.where(df["DOB"] < '1901-01-01', 0, df['AGE'])

df.loc[df["DOB"].dt.year.le(1900), 'AGE'] = 0

print(df)
         DOB  AGE
0 2004-05-17   18
1 1900-01-01    0
2 1900-02-01    0

# compare against one exact date (the string must be in YYYY-MM-DD format)
df['AGE'] = np.where(df["DOB"].eq('1900-01-01'), 0, df['AGE'])
print(df)
         DOB  AGE
0 2004-05-17   18
1 1900-01-01    0
2 1900-02-01  135
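As a self-contained check, here is a minimal sketch that rebuilds the sample data from the question and contrasts the `>=` condition with a year-based condition. It assumes that only dates in the year 1900 are placeholders and that AGE holds integers (the question reports an object dtype); the variable names `too_broad` and `only_1900` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; AGE is stored as integers here for simplicity.
df = pd.DataFrame({
    "DOB": pd.to_datetime(["2004-05-17", "1900-01-01", "1900-02-01"]),
    "AGE": [18, 141, 135],
})

# ">=" is true for every date on or after 1900-01-01, so it also selects 2004-05-17.
too_broad = df["DOB"] >= "1900-01-01"

# Comparing the year restricts the selection to the 1900 placeholder dates.
only_1900 = df["DOB"].dt.year == 1900

# Update AGE only in the rows selected by the year-based mask.
df.loc[only_1900, "AGE"] = 0
print(df)
```

Inspecting `too_broad` before assigning anything is a quick way to see why the attempts in the question changed every row.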





# Answer 2
**Score**: 0


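The problem with your code is that the condition `df["DOB"] >= "01-01-1900"` is true for every date on or after 1 January 1900, which includes valid dates such as 2004-05-17, so every row is updated. In addition, your second snippet passes df['DOB'] as the fallback value to np.where, so Age is filled with DOB values, which is not the desired behavior.

To set Age to 0 only for rows whose DOB falls in 1900, you can use the following code:

df.loc[df['DOB'].dt.year == 1900, 'Age'] = 0


This code sets Age to 0 in all rows where the year of DOB is 1900. The `loc` indexer selects the rows that meet the condition and updates the Age column.

Before running the code, make sure that the DOB column has a datetime dtype. You can convert it with `df['DOB'] = pd.to_datetime(df['DOB'])`.

I hope this helps!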
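The conversion step can be sketched as follows. This is only an illustration under assumptions: the frame `raw` and its values are hypothetical, DOB is assumed to arrive as strings, and `errors="coerce"` is one possible way to handle unparseable values, not something the question asks for.

```python
import pandas as pd

# Hypothetical raw data where DOB arrives as strings; the last value is deliberately invalid.
raw = pd.DataFrame({
    "DOB": ["2004-05-17", "1900-01-01", "not a date"],
    "Age": [18, 141, 50],
})

# errors="coerce" turns unparseable values into NaT instead of raising an error.
raw["DOB"] = pd.to_datetime(raw["DOB"], errors="coerce")

# NaT has no year, so the comparison is False for that row and its Age is left unchanged.
raw.loc[raw["DOB"].dt.year == 1900, "Age"] = 0
print(raw)
```

Without `errors="coerce"`, `pd.to_datetime` raises on the invalid string, which may be preferable when bad input should stop the pipeline.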




huangapple
  • Posted on 17 April 2023, 15:59:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76032875.html