Filter a pandas DataFrame using equality checks across several different columns


Question

The goal is to keep the rows where date1 and date3 differ, or where date2 and date4 differ (or both). One way to do this with pandas:

import pandas as pd

# Assumes the data is already in a DataFrame named 'df'.
# Convert the date columns to datetime; missing or unparseable values become NaT.
for col in ['date1', 'date2', 'date3', 'date4']:
    df[col] = pd.to_datetime(df[col], format='%d-%m-%Y', errors='coerce')

# NaT != NaT evaluates to True, so a plain != would also flag pairs
# in which both values are missing. Treat such pairs as equal.
def differs(a, b):
    return a.ne(b) & ~(a.isna() & b.isna())

# Keep rows where either pair of dates differs
filtered_df = df[differs(df['date1'], df['date3']) | differs(df['date2'], df['date4'])]

# Print the filtered DataFrame
print(filtered_df)

This code first converts the date columns to datetime and then filters the rows. The resulting filtered_df contains only the rows where date1 and date3 differ, where date2 and date4 differ, or both. A pair in which both values are missing is treated as equal.
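One thing to watch when comparing converted date columns is how missing values behave: in pandas, NaT != NaT evaluates to True, so a pair of empty cells counts as "different" unless it is handled explicitly. A minimal sketch on a made-up two-row frame (the data and the names naive, both_missing and strict are illustrative assumptions, not taken from the original post):

```python
import pandas as pd

# Hypothetical frame: row 0 has matching dates, row 1 has both dates missing
df = pd.DataFrame({"date1": ["14-06-2023", None], "date3": ["14-06-2023", None]})
for col in ["date1", "date3"]:
    df[col] = pd.to_datetime(df[col], format="%d-%m-%Y", errors="coerce")

# Plain inequality: the all-missing row is reported as different
naive = df["date1"] != df["date3"]

# Exclude pairs where both sides are missing, i.e. treat them as equal
both_missing = df["date1"].isna() & df["date3"].isna()
strict = naive & ~both_missing

print(naive.tolist())   # [False, True]
print(strict.tolist())  # [False, False]
```

Whether two missing dates should count as equal is a modelling decision; for data like the sample in this question, where every other row is entirely empty, treating them as equal is probably what is wanted.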

,unique_system_identifier,call_sign,date1,date2,date3,date4
0,3929436,WQZL268,14-06-2023,,14-06-2023,
1,3929436,WQZL268,,,,
2,3929437,WQZL269,14-06-2023,,14-06-2023,
3,3929437,WQZL269,,,,
4,3929438,WQZL270,14-06-2023,,14-06-2023,
5,3929438,WQZL270,,,,
6,3929439,WQZL271,14-06-2023,,14-06-2023,
7,3929439,WQZL271,,,,
8,3929440,WQZL272,14-06-2023,,14-06-2023,
9,3929440,WQZL272,,,,
10,3929441,WQZL273,14-06-2023,,14-06-2023,
11,3929441,WQZL273,,,,
12,3929442,WQZL274,14-06-2023,,14-06-2023,
13,3929442,WQZL274,,,,
14,3929443,WQZL275,14-06-2023,,14-06-2023,

I have a df like the one above. I need to keep only the rows where date1 and date3 differ, or where date2 and date4 differ; rows where both pairs differ should be kept as well.
How can I do this with pandas?

The columns come in with pandas object dtype, not as datetime or string.

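Answer 1

Score: 1

You can replace the missing values first, then compare the columns for inequality and combine the two masks with | (element-wise OR):

# Fill NaN with '' so that two missing values compare as equal
df1 = df.fillna('')
# Keep rows where date1 != date3 or date2 != date4
df = df[df1['date1'].ne(df1['date3']) | df1['date2'].ne(df1['date4'])]
print(df)

Output: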

Empty DataFrame
Columns: [Unnamed: 0, unique_system_identifier, call_sign, date1, date2, date3, date4]
Index: []
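The sample data in the question contains no row that satisfies either condition, which is why the result above is empty. To see the filter keep something, here is a small self-contained sketch using the same fillna-then-ne approach. The frame is hypothetical: rows 0 and 1 mirror the question's data, while rows 2 and 3 (and the names filled, mask, naive and result) are made up for illustration.

```python
import numpy as np
import pandas as pd

# dtype=object mimics the question's statement that the columns arrive as object dtype
df = pd.DataFrame(
    {
        "call_sign": ["WQZL268", "WQZL268", "WQZL269", "WQZL270"],
        "date1": ["14-06-2023", np.nan, "14-06-2023", "14-06-2023"],
        "date2": [np.nan, np.nan, np.nan, "15-06-2023"],
        "date3": ["14-06-2023", np.nan, "16-06-2023", "14-06-2023"],
        "date4": [np.nan, np.nan, np.nan, np.nan],
    },
    dtype=object,
)

# Replace NaN with '' so that two missing values compare as equal
filled = df.fillna("")
mask = filled["date1"].ne(filled["date3"]) | filled["date2"].ne(filled["date4"])
result = df[mask]

# Without fillna, every NaN pair counts as "different" and all rows pass
naive = df["date1"].ne(df["date3"]) | df["date2"].ne(df["date4"])

print(result)
print(naive.tolist())
```

Row 2 is kept because date1 and date3 differ, and row 3 because date2 is set while date4 is missing. Building the mask from the filled copy and indexing the original df keeps the NaN values in the result intact.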

huangapple
  • Posted on 2023-02-08 13:50:59
  • Please keep the link to this article when reposting: https://go.coder-hub.com/75381812.html