2023-03-09 19:54:15

Replacing specified columns in one DataFrame in pandas/Python with another column

Question

I have an Excel file with about 800,000 rows and several columns. Some of the column values will change, and I have to update them in the original database/Excel file. A second database/Excel file has identifying information that corresponds to the first database, along with the newly updated values. The second file has fewer rows and columns than the first, because it contains only the information that has changed. My goal is to identify what has changed and then update the original database/Excel file with the values from the second file. This is what I have been struggling with:

if inventory['marker'].equals(inventory2['marker']):
    inventory['retail'] = inventory.replace(inventory['retail'], inventory2['retail'])

The original inventory file has 800,000 rows and several columns. The inventory2 file is the dataset with the newly changed information. The marker column links the two databases, so I can tell what has changed by its marker. For some reason, this code does not change anything in the original database (inventory). If a marker appears in both databases, the information has changed, because the second database contains only the markers of changed items. In that case I'd like to replace the price in the original inventory with the price from inventory2. (The price is usually what has changed.) All the other columns in the original database (inventory) should stay intact, so that I can export it as a new file with the updated information.

Some help would be greatly appreciated.
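
A note on why the snippet above probably has no effect: `Series.equals` compares two Series as a whole and returns a single boolean, so when the two marker columns differ in length the condition is False and the assignment never runs. The sketch below illustrates this with made-up data (the values and the two-column layout are assumptions, not the asker's real files) and contrasts it with `isin`, which tests each row separately.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel files.
inventory = pd.DataFrame({"marker": ["aa", "bb", "cc", "dd"],
                          "retail": [5.0, 6.0, 3.0, 8.0]})
inventory2 = pd.DataFrame({"marker": ["bb", "dd"],
                           "retail": [2.5, 9.0]})

# equals() answers "are these two Series identical?" with one bool.
# The Series differ in length here, so an `if` on this value is skipped.
whole_match = inventory["marker"].equals(inventory2["marker"])

# isin() checks every row of inventory and returns a boolean mask.
row_match = inventory["marker"].isin(inventory2["marker"])

print(whole_match)
print(row_match.tolist())
```

A per-row mask like `row_match` is what identifies which rows of the original file have an update available.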

Answer 1

Score: 1

I think it does what you need:

import pandas as pd

# Small stand-ins for the two inventory files; both share the same columns.
inventory1 = pd.DataFrame({'marker': ['aa', 'bb', 'cc', 'dd'], 'col1': [5, 6, 3, 8], 'col2': [1, 3, 5, 9]})
inventory2 = pd.DataFrame({'marker': ['bb', 'dd'], 'col1': [243, 844], 'col2': [335, 9333]})

# For each changed row, overwrite the inventory1 row(s) that have the same marker.
# This requires inventory2 to have the same columns, in the same order, as inventory1.
for idx, row in inventory2.iterrows():
    inventory1.loc[inventory1['marker'] == row['marker']] = row.values

print(inventory1)

Output

  marker  col1  col2
0     aa     5     1
1     bb   243   335
2     cc     3     5
3     dd   844  9333
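
The loop above rewrites whole rows, so it fits when both frames have identical columns. The question describes a second file with fewer columns and a large first file, so a vectorized alternative may be worth considering. The following is a minimal sketch, not the answerer's method. It assumes the column names `marker` and `retail` from the question, a hypothetical extra column `stock`, and that each marker appears at most once in `inventory2`.

```python
import pandas as pd

# Hypothetical data: inventory has an extra column that inventory2 lacks.
inventory = pd.DataFrame({
    "marker": ["aa", "bb", "cc", "dd"],
    "retail": [5.0, 6.0, 3.0, 8.0],
    "stock": [1, 3, 5, 9],
})
inventory2 = pd.DataFrame({"marker": ["bb", "dd"], "retail": [2.5, 9.0]})

# Lookup table: marker -> new retail price.
# Assumes markers are unique in inventory2.
new_prices = inventory2.set_index("marker")["retail"]

# map() yields NaN for markers that are absent from inventory2;
# fillna() then keeps the existing price for those rows.
inventory["retail"] = inventory["marker"].map(new_prices).fillna(inventory["retail"])

print(inventory)
```

Only the `retail` column is touched, so every other column stays as it was. One caveat of this sketch: if a new price in `inventory2` is itself missing (NaN), the old price is kept. With real files, the frames would typically come from `pd.read_excel` and the result could be written back with `DataFrame.to_excel`.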
