Filter a pandas DataFrame on multiple column-inequality checks
Question
The goal is to keep the rows of a DataFrame where date1 and date3 differ, or date2 and date4 differ. One way to do this with pandas in Python:
import pandas as pd
# Assumes the data is already in a DataFrame named 'df'
# Convert the date columns to datetime; missing or unparseable values become NaT
for col in ['date1', 'date2', 'date3', 'date4']:
    df[col] = pd.to_datetime(df[col], format='%d-%m-%Y', errors='coerce')
# NaT != NaT evaluates to True, so exclude pairs where both values are missing
diff_13 = (df['date1'] != df['date3']) & ~(df['date1'].isna() & df['date3'].isna())
diff_24 = (df['date2'] != df['date4']) & ~(df['date2'].isna() & df['date4'].isna())
# Filter rows based on the two conditions
filtered_df = df[diff_13 | diff_24]
# Print the filtered DataFrame
print(filtered_df)
This code first converts the date columns to datetime and then filters on the two conditions. Missing dates become NaT, and NaT never compares equal to itself, so a pair in which both values are missing has to be treated as equal explicitly; otherwise every row with two empty cells would be kept. The resulting filtered_df contains only the rows where date1 and date3 differ, date2 and date4 differ, or both.
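The effect of missing dates on an inequality filter like the one above can be checked on a tiny frame. This is only a sketch: the sample rows and the helper name `differs` are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical sample data: row 0 matches, row 1 is entirely empty,
# row 2 differs in date1/date3, row 3 has a value only in date2.
df = pd.DataFrame({
    "date1": ["14-06-2023", None, "14-06-2023", "14-06-2023"],
    "date2": [None, None, None, "15-06-2023"],
    "date3": ["14-06-2023", None, "15-06-2023", "14-06-2023"],
    "date4": [None, None, None, None],
})

date_cols = ["date1", "date2", "date3", "date4"]
# Parse each column; missing values become NaT
dates = df[date_cols].apply(
    lambda s: pd.to_datetime(s, format="%d-%m-%Y", errors="coerce")
)


def differs(a: pd.Series, b: pd.Series) -> pd.Series:
    """True where a and b differ, treating two missing values as equal."""
    return a.ne(b) & ~(a.isna() & b.isna())


# Plain inequality: NaT != NaT is True, so empty pairs count as "different"
naive_mask = dates["date1"].ne(dates["date3"]) | dates["date2"].ne(dates["date4"])
# NaT-aware inequality
safe_mask = differs(dates["date1"], dates["date3"]) | differs(dates["date2"], dates["date4"])

naive_rows = df.index[naive_mask].tolist()
safe_rows = df.index[safe_mask].tolist()
print(naive_rows)
print(safe_rows)
```

In this sketch the plain comparison selects every row, while the NaT-aware version selects only rows 2 and 3. Note that `differs` still counts "value versus missing" as a difference, which may or may not be what is wanted.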
,unique_system_identifier,call_sign,date1,date2,date3,date4
0,3929436,WQZL268,14-06-2023,,14-06-2023,
1,3929436,WQZL268,,,,
2,3929437,WQZL269,14-06-2023,,14-06-2023,
3,3929437,WQZL269,,,,
4,3929438,WQZL270,14-06-2023,,14-06-2023,
5,3929438,WQZL270,,,,
6,3929439,WQZL271,14-06-2023,,14-06-2023,
7,3929439,WQZL271,,,,
8,3929440,WQZL272,14-06-2023,,14-06-2023,
9,3929440,WQZL272,,,,
10,3929441,WQZL273,14-06-2023,,14-06-2023,
11,3929441,WQZL273,,,,
12,3929442,WQZL274,14-06-2023,,14-06-2023,
13,3929442,WQZL274,,,,
14,3929443,WQZL275,14-06-2023,,14-06-2023,
I have a DataFrame like the one above and need to keep only the rows where date1 and date3 differ, or where date2 and date4 differ. Rows where both pairs differ should be kept as well.
How can I do this with pandas?
The columns are read in as pandas object dtype, not as datetime or string.
Answer 1
Score: 1
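You can replace the missing values first, then compare for inequality with ne, and chain both masks with | for a bitwise OR. Filling first matters because NaN never compares equal to NaN, so two empty cells would otherwise count as different:
df1 = df.fillna('')
df = df[df1['date1'].ne(df1['date3']) | df1['date2'].ne(df1['date4'])]
print(df)
With the sample data every pair matches, so the result is empty: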
Empty DataFrame
Columns: [Unnamed: 0, unique_system_identifier, call_sign, date1, date2, date3, date4]
Index: []
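Since the sample data produces an empty result, a variant with one altered row makes the filter easier to verify. The following is a minimal sketch, assuming the data is read from CSV text in the same layout as the question. The changed date in row 2 and the names `csv_text`, `filled`, `mask`, and `result` are illustrative only.

```python
import io

import pandas as pd

# Hypothetical CSV in the question's layout; date3 in row 2 was altered
csv_text = """,unique_system_identifier,call_sign,date1,date2,date3,date4
0,3929436,WQZL268,14-06-2023,,14-06-2023,
1,3929436,WQZL268,,,,
2,3929437,WQZL269,14-06-2023,,15-06-2023,
3,3929437,WQZL269,,,,
"""

date_cols = ["date1", "date2", "date3", "date4"]
# Read the date columns as strings so empty cells are NaN in text columns
df = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,
    dtype={col: str for col in date_cols},
)

# Fill missing values so that two empty cells compare as equal
filled = df.fillna("")
mask = filled["date1"].ne(filled["date3"]) | filled["date2"].ne(filled["date4"])

# Keep the original (unfilled) rows that satisfy either condition
result = df[mask]
print(result)
```

Only row 2 is kept here. Assigning the outcome to a new name instead of overwriting `df` keeps the original frame available for further checks.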