March 10, 2023, 01:45:40

PySpark: Update column values from dataframe A with dataframe B's values with matching ID

Question

Assume we have dfA:

ID	Scores
A	20
A	40
A	60
B	10
B	90

and dfB:

ID	Scores
A	60
B	90

Expected output:

ID	Scores
A	60
A	60
A	60
B	90
B	90

How can I update the Scores column in dfA with the score from dfB for the matching ID in PySpark?

Answer 1

Score: 1

Your DataFrames

df_1
+---+------+
| ID|Scores|
+---+------+
|  A|    20|
|  A|    40|
|  A|    60|
|  B|    10|
|  B|    90|
+---+------+

df_2
+---+------+
| ID|Scores|
+---+------+
|  A|    60|
|  B|    90|
+---+------+

Rename the Scores column in df_1 to old_scores before joining, so the joined result does not contain two columns named Scores.

df_1 = df_1.withColumnRenamed("Scores", "old_scores")

Use an inner join to match the two DataFrames on the common key column, ID.

df = df_1.join(df_2, "ID")

Drop the old_scores column from the joined DataFrame.

df.drop("old_scores").show()

Output:

+---+------+
| ID|Scores|
+---+------+
|  A|    60|
|  A|    60|
|  A|    60|
|  B|    90|
|  B|    90|
+---+------+
