How to swap rows in two pandas dataframes that contain lists and have different sizes?

Question

You can do this in Python with the pandas library. The index labels of df1 (0, 2, 4, ...) and df2 (1, 3, 5, ...) never coincide, so the rows have to be paired by position rather than by label. The code below swaps the second element of each row of df1 with the second element of the row at the same position in df2; rows without a partner are left unchanged:

    import pandas as pd

    # df1 and df2 are the dataframes from the question.
    # Copy every list so the original dataframes are not modified.
    updated_df1 = df1.copy()
    updated_df2 = df2.copy()
    updated_df1['Path'] = df1['Path'].apply(list)
    updated_df2['Path'] = df2['Path'].apply(list)

    # Pair rows by position; zip stops at the shorter dataframe,
    # so the extra rows of the longer one are never touched.
    for path1, path2 in zip(updated_df1['Path'], updated_df2['Path']):
        path1[1], path2[1] = path2[1], path1[1]  # swap the second element

    print(updated_df1)
    print(updated_df2)

The swapped second elements end up in updated_df1 and updated_df2, and any row with no counterpart in the other dataframe is kept as it was.


I have two Pandas dataframes, df1 and df2. Each dataframe has one column named 'Path'. Each row has a list. They are like this:

df1

    Path
    [OAK, ORD, FLL, PBG]
    [OAK, SEA, FLL, PBG]
    [OAK, AUS, FLL, PBG]
    [OAK, LAS, FLL, PBG]
    [OAK, LAX, FLL, PBG]
    [OAK, DAL, FLL, PBG]
    [OAK, MDW, FLL, PBG]
    [OAK, BWI, FLL, PBG]

The df1 constructor is:

    {'Path': {0: ['OAK', 'ORD', 'FLL', 'PBG'], 2: ['OAK', 'SEA', 'FLL', 'PBG'], 4: ['OAK', 'AUS', 'FLL', 'PBG'], 6: ['OAK', 'LAS', 'FLL', 'PBG'], 8: ['OAK', 'LAX', 'FLL', 'PBG'], 10: ['OAK', 'DAL', 'FLL', 'PBG'], 12: ['OAK', 'MDW', 'FLL', 'PBG'], 14: ['OAK', 'BWI', 'FLL', 'PBG']}}

df2

    Path
    [OAK, DFW, FLL, PBG]
    [OAK, JFK, FLL, PBG]
    [OAK, MCI, FLL, PBG]
    [OAK, PHX, FLL, PBG]
    [OAK, DEN, FLL, PBG]
    [OAK, HOU, FLL, PBG]
    [OAK, ATL, FLL, PBG]

The df2 constructor is:

    {'Path': {1: ['OAK', 'DFW', 'FLL', 'PBG'], 3: ['OAK', 'JFK', 'FLL', 'PBG'], 5: ['OAK', 'MCI', 'FLL', 'PBG'], 7: ['OAK', 'PHX', 'FLL', 'PBG'], 9: ['OAK', 'DEN', 'FLL', 'PBG'], 11: ['OAK', 'HOU', 'FLL', 'PBG'], 13: ['OAK', 'ATL', 'FLL', 'PBG']}}
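To reproduce the setup, the two dictionaries above can be passed to pd.DataFrame. This is only a minimal sketch, assuming the dictionaries are used exactly as shown; note that df1 ends up with even index labels and df2 with odd ones:

```python
import pandas as pd

# Dictionaries copied from the two constructors shown above.
df1 = pd.DataFrame({'Path': {
    0: ['OAK', 'ORD', 'FLL', 'PBG'], 2: ['OAK', 'SEA', 'FLL', 'PBG'],
    4: ['OAK', 'AUS', 'FLL', 'PBG'], 6: ['OAK', 'LAS', 'FLL', 'PBG'],
    8: ['OAK', 'LAX', 'FLL', 'PBG'], 10: ['OAK', 'DAL', 'FLL', 'PBG'],
    12: ['OAK', 'MDW', 'FLL', 'PBG'], 14: ['OAK', 'BWI', 'FLL', 'PBG']}})

df2 = pd.DataFrame({'Path': {
    1: ['OAK', 'DFW', 'FLL', 'PBG'], 3: ['OAK', 'JFK', 'FLL', 'PBG'],
    5: ['OAK', 'MCI', 'FLL', 'PBG'], 7: ['OAK', 'PHX', 'FLL', 'PBG'],
    9: ['OAK', 'DEN', 'FLL', 'PBG'], 11: ['OAK', 'HOU', 'FLL', 'PBG'],
    13: ['OAK', 'ATL', 'FLL', 'PBG']}})

# df1 has one more row than df2, and the index labels do not overlap.
print(df1.shape, list(df1.index))
print(df2.shape, list(df2.index))
```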

One problem is that I have a different number of rows in my dataframes. I would like to swap the second element of each row of df1 with the second element of each row of df2. If there is no corresponding row, the row should not be modified or dropped. The desired output is:

df1

    Path
    [OAK, DFW, FLL, PBG]
    [OAK, JFK, FLL, PBG]
    [OAK, MCI, FLL, PBG]
    [OAK, PHX, FLL, PBG]
    [OAK, DEN, FLL, PBG]
    [OAK, HOU, FLL, PBG]
    [OAK, ATL, FLL, PBG]
    [OAK, BWI, FLL, PBG]

df2

    Path
    [OAK, ORD, FLL, PBG]
    [OAK, SEA, FLL, PBG]
    [OAK, AUS, FLL, PBG]
    [OAK, LAS, FLL, PBG]
    [OAK, LAX, FLL, PBG]
    [OAK, DAL, FLL, PBG]
    [OAK, MDW, FLL, PBG]

How can I do it in Python?

Answer 1

Score: 2

You can use combine_first() here after converting each series of lists into a dataframe:

    import pandas as pd

    # Expand each list into columns 0..3; rows get a fresh 0..n-1 index
    n = pd.DataFrame(df2['Path'].tolist())
    m = pd.DataFrame(df1['Path'].tolist())

    # Take column 1 from the other dataframe, fill the rest from this one,
    # drop rows left incomplete, then rebuild one list per row
    df1_final = n[[1]].combine_first(m).dropna().agg(list, axis=1)
    df2_final = m[[1]].combine_first(n).dropna().agg(list, axis=1)

    print(df1_final)
    print('\n')
    print(df2_final)

Output:

    0    [OAK, DFW, FLL, PBG]
    1    [OAK, JFK, FLL, PBG]
    2    [OAK, MCI, FLL, PBG]
    3    [OAK, PHX, FLL, PBG]
    4    [OAK, DEN, FLL, PBG]
    5    [OAK, HOU, FLL, PBG]
    6    [OAK, ATL, FLL, PBG]
    7    [OAK, BWI, FLL, PBG]
    dtype: object


    0    [OAK, ORD, FLL, PBG]
    1    [OAK, SEA, FLL, PBG]
    2    [OAK, AUS, FLL, PBG]
    3    [OAK, LAS, FLL, PBG]
    4    [OAK, LAX, FLL, PBG]
    5    [OAK, DAL, FLL, PBG]
    6    [OAK, MDW, FLL, PBG]
    dtype: object
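To see why the combine_first() approach handles the extra row, it may help to look at the intermediate frame on a smaller input. The sketch below uses hypothetical three-row and two-row dataframes shortened from the question. The last two lines are an optional addition, not part of the answer: they put the original index labels back and assume the row order is unchanged.

```python
import pandas as pd

# Hypothetical shortened inputs: df1 has one row more than df2.
df1 = pd.DataFrame({'Path': {0: ['OAK', 'ORD', 'FLL', 'PBG'],
                             2: ['OAK', 'SEA', 'FLL', 'PBG'],
                             4: ['OAK', 'BWI', 'FLL', 'PBG']}})
df2 = pd.DataFrame({'Path': {1: ['OAK', 'DFW', 'FLL', 'PBG'],
                             3: ['OAK', 'JFK', 'FLL', 'PBG']}})

m = pd.DataFrame(df1['Path'].tolist())  # positional index 0..2
n = pd.DataFrame(df2['Path'].tolist())  # positional index 0..1

# Column 1 comes from m, columns 0, 2 and 3 come from n.
# Row 2 exists only in m, so its other columns are NaN here.
step = m[[1]].combine_first(n)
print(step)

# dropna() removes that incomplete row, so df2 keeps its two rows.
df2_final = step.dropna().agg(list, axis=1)
# In the other direction the missing value in column 1 is filled from m,
# so the unmatched row of df1 survives unchanged.
df1_final = n[[1]].combine_first(m).dropna().agg(list, axis=1)

# Optional: restore the original row labels (assumes row order is unchanged).
df1_final.index = df1.index
df2_final.index = df2.index
print(df1_final)
print(df2_final)
```

The results are Series rather than dataframes; calling to_frame('Path') on each would turn them back into a one-column dataframe.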

huangapple
  • Posted on January 3, 2020 at 22:56:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59580730.html