Replacing specified columns in one DataFrame with another column in pandas/Python

Question

I have an Excel file with about 800,000 rows and several columns. Some of the column values will change, and I have to update them in the original database/Excel file. A second database/Excel file has identifying information that corresponds to the first database, along with the newly updated values. The second file has fewer rows and columns than the first, because it contains only the information that has changed. My goal is to identify what has changed and then update the original database/Excel file with the values from the second file. I have struggled with this:

if inventory['marker'].equals(inventory2['marker']):
    inventory['retail'] = inventory.replace(inventory['retail'], inventory2['retail'])

The original inventory file has 800,000 rows and several columns. The inventory2 file is the dataset with the newly changed information. The marker column links the two databases, so I can tell what has changed by marker. For some reason, this code does not change anything in the original database (inventory). If a marker appears in both databases (which means the information has changed, because the second database contains only the markers that changed), I'd like to replace the price in inventory with the price from inventory2; the price is usually what has changed. All other columns in the original database (inventory) should stay intact, so that I can export it as a new file containing the updated information.

Some help would be greatly appreciated.

Answer 1

Score: 1

I think this does what you need:

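import pandas as pd

# Full inventory: one row per marker
inventory1 = pd.DataFrame({'marker': ['aa', 'bb', 'cc', 'dd'], 'col1': [5, 6, 3, 8], 'col2': [1, 3, 5, 9]})
# Changed rows only, with the same columns in the same order as inventory1
inventory2 = pd.DataFrame({'marker': ['bb', 'dd'], 'col1': [243, 844], 'col2': [335, 9333]})

# For each changed row, overwrite the row(s) in inventory1 with the same marker.
# Assigning row.values replaces every column, so the column layouts must match.
for idx, row in inventory2.iterrows():
    inventory1.loc[inventory1['marker'] == row['marker']] = row.values

print(inventory1)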

Output:

  marker  col1  col2
0     aa     5     1
1     bb   243   335
2     cc     3     5
3     dd   844  9333
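
A note on why the attempt in the question changes nothing, plus a sketch of a loop-free alternative. `Series.equals` compares two whole columns and returns a single True/False, so when inventory2 has fewer rows than inventory the `if` is False and the assignment never runs. With 800,000 rows, scanning inventory once per changed row can also get slow. One option is to index both frames by marker and let `DataFrame.update` overwrite the matching cells. The column names `retail` and `stock` and all values below are made up for illustration, and the sketch assumes each marker appears at most once per frame.

```python
import pandas as pd

# Toy stand-ins for the two files; the 'retail' and 'stock' columns and all
# values are hypothetical, chosen only to illustrate the idea.
inventory = pd.DataFrame({
    'marker': ['aa', 'bb', 'cc', 'dd'],
    'retail': [5.0, 6.0, 3.0, 8.0],
    'stock': [1, 3, 5, 9],
})
inventory2 = pd.DataFrame({
    'marker': ['bb', 'dd'],
    'retail': [2.5, 8.75],
})

# Series.equals compares the two columns as a whole and returns one bool,
# so an `if` built on it is False whenever the frames differ in length.
print(inventory['marker'].equals(inventory2['marker']))  # False

# Align both frames on 'marker', then overwrite only the matching cells.
# Assumption: markers are unique within each frame.
updated = inventory.set_index('marker')
updated.update(inventory2.set_index('marker'))
updated = updated.reset_index()

print(updated)
```

Only the columns present in inventory2 are overwritten, so `stock` is left alone here. `update` also skips missing (NaN) values in inventory2, which means an empty cell in the changes file will not blank out an existing price.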

huangapple
  • Posted on March 9, 2023, 19:54:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75684259.html