Replacing specified columns in one pandas DataFrame with columns from another
Question
I have an Excel file with about 800,000 rows and several columns. Some of the column values will change, and I have to update them in the original database/Excel file. A second database/Excel file has identifying information that corresponds to the first database, along with the newly updated values. The second file has fewer rows and columns than the first, because it contains only the information that has changed. My goal is to identify what has changed and then update the original database/Excel file with the updated values from the second file. This is what I have tried:
if inventory['marker'].equals(inventory2['marker']):
    inventory['retail'] = inventory.replace(inventory['retail'], inventory2['retail'])
The original inventory file has 800,000 rows and several columns. The inventory2 file is the dataset with the newly changed information. The marker column is what links the two databases (I can tell what has changed by marker). For some reason, this code does not change the information in the original database (inventory). If a marker appears in both databases (which means the information has changed, because the second database contains only the markers that have changed), I'd like to replace the original inventory price with the inventory2 price. (The price is usually what has changed.) All other columns in the original database (inventory) should stay intact, so that I can export it as a new file containing the updated information.
Some help would be greatly appreciated.
Answer 1
Score: 1
I think this does what you need:
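import pandas as pd

# Original data, and a smaller frame holding only the changed rows.
inventory1 = pd.DataFrame({'marker': ['aa', 'bb', 'cc', 'dd'], 'col1': [5, 6, 3, 8], 'col2': [1, 3, 5, 9]})
inventory2 = pd.DataFrame({'marker': ['bb', 'dd'], 'col1': [243, 844], 'col2': [335, 9333]})

# Overwrite each row of inventory1 whose marker matches a row of inventory2.
# This assumes both frames have the same columns in the same order.
for idx, row in inventory2.iterrows():
    inventory1.loc[inventory1['marker'] == row['marker']] = row.values

print(inventory1)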
Output:
  marker  col1  col2
0     aa     5     1
1     bb   243   335
2     cc     3     5
3     dd   844  9333
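The attempt in the question probably does nothing because `Series.equals` compares the two `marker` columns as a whole, and two columns of different lengths are never equal, so the `if` branch is skipped. For a frame with 800,000 rows, a row-by-row loop may also be slow. Below is a minimal sketch of a vectorized alternative that updates only the `retail` column. It assumes that each marker appears at most once in `inventory2`. The sample values and the `stock` column are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; 'stock' stands in for the columns that must not change.
inventory = pd.DataFrame({
    'marker': ['aa', 'bb', 'cc', 'dd'],
    'retail': [5.0, 6.0, 3.0, 8.0],
    'stock': [1, 3, 5, 9],
})
inventory2 = pd.DataFrame({
    'marker': ['bb', 'dd'],
    'retail': [243.0, 844.0],
})

# Build a marker -> new price lookup (assumes markers in inventory2 are unique).
new_prices = inventory2.set_index('marker')['retail']

# Look up the new price for every row; rows whose marker is absent get NaN.
looked_up = inventory['marker'].map(new_prices)

# Keep the old price wherever no new price was found.
inventory['retail'] = looked_up.fillna(inventory['retail'])

print(inventory)
```

Only `retail` is reassigned, so every other column of `inventory` is left as it was, and the result can then be exported, for example with `to_excel`.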