Filter a pandas DataFrame by checking whether several pairs of columns are equal

Question

You want to keep the rows of a DataFrame in which date1 and date3 differ or date2 and date4 differ. Here is one way to do it with pandas in Python:

    import pandas as pd

    # Assumes the data is already in a DataFrame named 'df'.
    # Convert the date columns to datetime; missing or unparseable values become NaT.
    for col in ['date1', 'date2', 'date3', 'date4']:
        df[col] = pd.to_datetime(df[col], format='%d-%m-%Y', errors='coerce')

    # NaT != NaT evaluates to True, so exclude pairs where both values are missing.
    diff_1_3 = (df['date1'] != df['date3']) & ~(df['date1'].isna() & df['date3'].isna())
    diff_2_4 = (df['date2'] != df['date4']) & ~(df['date2'].isna() & df['date4'].isna())

    # Keep rows where either pair differs.
    filtered_df = df[diff_1_3 | diff_2_4]
    print(filtered_df)

This code first converts the date columns to datetime and then filters the rows. The resulting filtered_df contains only the rows where date1 and date3 differ, date2 and date4 differ, or both. A pair in which both values are missing is treated as equal.
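One caveat with comparing datetime columns that contain missing values: a missing value (NaT) never compares equal to another missing value, so a plain != flags rows where both cells are empty. The sketch below illustrates this on a small hypothetical two-row frame (the frame and the variable names are assumptions for illustration, not part of the original post).

```python
import pandas as pd

# Hypothetical frame: row 0 has equal dates, row 1 has both dates missing.
df = pd.DataFrame({
    "date1": ["14-06-2023", None],
    "date3": ["14-06-2023", None],
})
for col in ["date1", "date3"]:
    df[col] = pd.to_datetime(df[col], format="%d-%m-%Y", errors="coerce")

# Naive comparison: NaT != NaT evaluates to True, so row 1 is flagged.
naive_mask = df["date1"] != df["date3"]

# Null-aware comparison: a pair counts as different only when the values
# differ and the two cells are not both missing.
both_missing = df["date1"].isna() & df["date3"].isna()
null_aware_mask = naive_mask & ~both_missing

print(naive_mask.tolist())
print(null_aware_mask.tolist())
```

Whether two missing cells should count as equal is a choice that depends on the data; the null-aware mask encodes the "missing equals missing" reading.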

Original question:
    ,unique_system_identifier,call_sign,date1,date2,date3,date4
    0,3929436,WQZL268,14-06-2023,,14-06-2023,
    1,3929436,WQZL268,,,,
    2,3929437,WQZL269,14-06-2023,,14-06-2023,
    3,3929437,WQZL269,,,,
    4,3929438,WQZL270,14-06-2023,,14-06-2023,
    5,3929438,WQZL270,,,,
    6,3929439,WQZL271,14-06-2023,,14-06-2023,
    7,3929439,WQZL271,,,,
    8,3929440,WQZL272,14-06-2023,,14-06-2023,
    9,3929440,WQZL272,,,,
    10,3929441,WQZL273,14-06-2023,,14-06-2023,
    11,3929441,WQZL273,,,,
    12,3929442,WQZL274,14-06-2023,,14-06-2023,
    13,3929442,WQZL274,,,,
    14,3929443,WQZL275,14-06-2023,,14-06-2023,

I have a DataFrame like the one above. I need to keep only the rows where date1 and date3 differ, or where date2 and date4 differ; rows where both pairs differ should be kept as well.
How can I do this with pandas?

The columns are read in as pandas object dtype, not as datetime or string.
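To experiment with the question, the sample can be loaded from text. The following is a minimal sketch that uses only the first four rows of the data shown above; reading through io.StringIO and using index_col=0 are assumptions made for the example, not something the original post specifies.

```python
import io
import pandas as pd

# First four rows of the sample data from the question.
csv_text = """,unique_system_identifier,call_sign,date1,date2,date3,date4
0,3929436,WQZL268,14-06-2023,,14-06-2023,
1,3929436,WQZL268,,,,
2,3929437,WQZL269,14-06-2023,,14-06-2023,
3,3929437,WQZL269,,,,
"""

# index_col=0 uses the unnamed first column as the index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

date_cols = ["date1", "date2", "date3", "date4"]
# The inferred dtypes depend on the pandas version and on the data itself.
print(df[date_cols].dtypes)

# Count the missing cells per date column.
missing_counts = df[date_cols].isna().sum()
print(missing_counts)
```

Inspecting the dtypes and the missing-value counts first makes it easier to decide how missing cells should be treated in the comparison.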

Answer 1

Score: 1

You can replace the missing values first, then compare for inequality, and combine the two masks with | for a bitwise OR:

    # Replace missing values with '' so that two missing cells compare as equal
    df1 = df.fillna('')
    # Keep rows where date1 != date3 or date2 != date4
    df = df[df1['date1'].ne(df1['date3']) | df1['date2'].ne(df1['date4'])]
    print(df)

Output:

    Empty DataFrame
    Columns: [Unnamed: 0, unique_system_identifier, call_sign, date1, date2, date3, date4]
    Index: []

The result is empty because, in the sample data, every row has matching values in both column pairs.
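The same idea extends to any number of column pairs. Below is a minimal sketch, assuming a hypothetical helper named rows_where_any_pair_differs and a small made-up frame that, unlike the sample in the question, contains rows that actually differ.

```python
import pandas as pd

def rows_where_any_pair_differs(df, pairs):
    """Return the rows where at least one (left, right) column pair differs.

    Missing values are replaced with '' first, so two missing cells
    count as equal, following the fillna approach in the answer.
    """
    filled = df.fillna("")
    mask = pd.Series(False, index=df.index)
    for left, right in pairs:
        mask |= filled[left].ne(filled[right])
    return df[mask]

# Hypothetical data: rows 0 and 1 match in both pairs, row 2 differs in
# date1/date3, row 3 differs in date2/date4 (missing vs. present).
sample = pd.DataFrame({
    "date1": ["14-06-2023", None, "14-06-2023", None],
    "date2": [None, None, "01-01-2023", None],
    "date3": ["14-06-2023", None, "15-06-2023", None],
    "date4": [None, None, "01-01-2023", "02-02-2023"],
})

result = rows_where_any_pair_differs(
    sample, [("date1", "date3"), ("date2", "date4")]
)
print(result.index.tolist())
```

Returning df[mask] rather than the filled frame keeps the original missing values in the output, as the answer does.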

huangapple
  • Published on February 8, 2023, 13:50:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75381812.html