May 14, 2023, 22:35:06

Insert values from matrix dataframe into another dataframe using indexes

Question

I have a pandas DataFrame df_matrix with shape (4, 5) (the real one is much, much bigger). Its row and column labels are integers.

df_matrix =

	0	1	2	3	4
0	92.576	94.269	108.308	140.394	155.421
1	117.490	104.356	104.952	131.331	144.203
2	115.405	112.536	112.069	116.328	136.226
3	164.047	148.946	133.204	122.235	141.075

df_matrix contains the values I want to put into the depth column of another DataFrame, df_main, looked up by its df_main['row_i'] and df_main['col_i'] columns.
So the expected result looks like this:

df_main =

	datetime	latitude	longitude	row_i	col_i	depth
0	2013-05-29 00:39:44	51.708487	104.366131	0	0	92.576
1	2013-05-29 00:40:44	51.708268	104.362324	0	2	108.308
2	2013-05-29 00:41:44	51.708036	104.358530	1	3	131.331
...	...	...	...	...	...	...
296448	2022-06-14 03:39:40	51.876903	105.520172	3	4	141.075

I solved it with iterrows:

for index, row in df_main.iterrows():
    df_main.loc[index, 'depth'] = df_matrix.loc[row['row_i'], row['col_i']]

This takes a long time on 200,000+ rows. I believe pandas has an appropriate method for this, maybe merge, but I have no idea which one or how to use it.
Is there a more Pythonic ("pandasonic") solution?

Answer 1

Score: 0

I'm not sure whether merge will be efficient here, but you can still try it:

import pandas as pd

# stack() turns df_matrix into a Series indexed by (row label, column label)
df_main = (
    pd.merge(df_main,
             df_matrix.stack().rename("depth"),
             left_on=["row_i", "col_i"], right_index=True, how="left")
)

Output:

print(df_main)
                   datetime   latitude   longitude  row_i  col_i    depth
0       2013-05-29 00:39:44  51.708487  104.366131      0      0   92.576
1       2013-05-29 00:40:44  51.708268  104.362324      0      2  108.308
2       2013-05-29 00:41:44  51.708036  104.358530      1      3  131.331
296448  2022-06-14 03:39:40  51.876903  105.520172      3      4  141.075
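
For anyone who wants to run the merge approach, here is a minimal, self-contained sketch on the sample values from the question. The name lookup and the reduced four-row df_main (only the row_i and col_i columns) are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample matrix copied from the question (default integer row/column labels).
df_matrix = pd.DataFrame(
    [
        [92.576, 94.269, 108.308, 140.394, 155.421],
        [117.490, 104.356, 104.952, 131.331, 144.203],
        [115.405, 112.536, 112.069, 116.328, 136.226],
        [164.047, 148.946, 133.204, 122.235, 141.075],
    ]
)

# Reduced stand-in for df_main: only the two lookup columns (assumption).
df_main = pd.DataFrame({"row_i": [0, 0, 1, 3], "col_i": [0, 2, 3, 4]})

# stack() reshapes the matrix into a Series whose MultiIndex is
# (row label, column label); rename() gives the merged column its name.
lookup = df_matrix.stack().rename("depth")

# Match (row_i, col_i) pairs against the two index levels of the Series.
merged = df_main.merge(
    lookup, left_on=["row_i", "col_i"], right_index=True, how="left"
)
print(merged)
```

Because how="left" is used, every row of df_main is kept, and a (row_i, col_i) pair that has no entry in the matrix would get NaN in depth rather than raising an error.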

Answer 2

Score: 0

This should work as well:

df_main['depth'] = df_matrix.to_numpy()[df_main['row_i'], df_main['col_i']]
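
A small runnable sketch of this NumPy indexing approach, using the sample values from the question. It assumes row_i and col_i are positions (which holds here because the matrix labels are 0..n-1). The second half is a hypothetical variant, not from the original answer, for the case where the labels are not simple positions; the names values, rows, cols, row_pos, col_pos and depth_by_label are made up for illustration.

```python
import pandas as pd

# Sample matrix copied from the question (default integer row/column labels).
df_matrix = pd.DataFrame(
    [
        [92.576, 94.269, 108.308, 140.394, 155.421],
        [117.490, 104.356, 104.952, 131.331, 144.203],
        [115.405, 112.536, 112.069, 116.328, 136.226],
        [164.047, 148.946, 133.204, 122.235, 141.075],
    ]
)

# Reduced stand-in for df_main: only the two lookup columns (assumption).
df_main = pd.DataFrame({"row_i": [0, 0, 1, 3], "col_i": [0, 2, 3, 4]})

values = df_matrix.to_numpy()
rows = df_main["row_i"].to_numpy()
cols = df_main["col_i"].to_numpy()

# Integer-array indexing: element k of the result is values[rows[k], cols[k]].
df_main["depth"] = values[rows, cols]
print(df_main)

# Hypothetical variant for labels that are not 0..n-1: translate labels to
# positions first. get_indexer returns -1 for a label it cannot find, and -1
# would silently select the last row/column, so check for it before indexing.
row_pos = df_matrix.index.get_indexer(rows)
col_pos = df_matrix.columns.get_indexer(cols)
if (row_pos < 0).any() or (col_pos < 0).any():
    raise KeyError("some row_i/col_i labels are missing from df_matrix")
depth_by_label = values[row_pos, col_pos]
```

This does the whole lookup in one vectorized step and never builds the stacked intermediate Series, which may matter when the matrix is large.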
