June 26, 2023 06:43:42

Conditional Left Joining in dplyr

Question

I have two datasets.

Dataset 1 looks like the following:

dat1 <- read.table(header=TRUE, text="
ID  Per  Gu  Ta
1123 112301 14 13
1124 112401 14 19
1125 112501 29 25
1126 112601 22 20
")
dat1
    ID    Per Gu Ta
1 1123 112301 14 13
2 1124 112401 14 19
3 1125 112501 29 25
4 1126 112601 22 20

Dataset 2 looks like the following:

dat2 <- read.table(header=TRUE, text="
ID  Veh  Pert  Ti  ID1
1123 1 1 100 11231
1123 2 1 110 11232
1124 1 1 107 11241
1124 2 1 111 11242
1124 3 2 109 11243
1125 2 2 118 11251
1125 3 3 113 11252
1125 4 1 108 11253
1126 3 4 119 11265
1126 3 1 112 11268
")
dat2
     ID Veh Pert  Ti   ID1
1  1123   1    1 100 11231
2  1123   2    1 110 11232
3  1124   1    1 107 11241
4  1124   2    1 111 11242
5  1124   3    2 109 11243
6  1125   2    2 118 11251
7  1125   3    3 113 11252
8  1125   4    1 108 11253
9  1126   3    4 119 11265
10 1126   3    1 112 11268

I need to left join dat1 with dat2 by ID, keeping for each ID only the dat2 row with the minimum Veh and, when Veh is tied, the minimum Pert. The final data should look like this:

ID   Per    Gu Ta Veh  Pert  Ti  ID1
1123 112301 14 13 1     1    100 11231
1124 112401 14 19 1     1    107 11241
1125 112501 29 25 2     2    118 11251   ### `row 8` of `dat2` has the minimum `Pert` (`1`) for this ID, but its `Veh` is `4`, so the row with the minimum `Veh` is chosen
1126 112601 22 20 3     1    112 11268

Answer 1

Score: 1

I did it this way:

1. left_join, then arrange by Veh and Pert
2. group_by ID and pick the first value of each column

library(dplyr)
dat1 <- read.table(
  header = TRUE,
  text = "
ID  Per  Gu  Ta
1123 112301 14 13
1124 112401 14 19
1125 112501 29 25
1126 112601 22 20
"
)
dat2 <- read.table(
  header = TRUE,
  text = "
ID  Veh  Pert  Ti  ID1
1123 1 1 100 11231
1123 2 1 110 11232
1124 1 1 107 11241
1124 2 1 111 11242
1124 3 2 109 11243
1125 2 2 118 11251
1125 3 3 113 11252
1125 4 1 108 11253
1126 3 4 119 11265
1126 3 1 112 11268
"
)
# left join, then order by Veh and, within equal Veh, by Pert
left_joined_df <-
  dat1 %>% left_join(dat2, by = c("ID" = "ID")) %>% arrange(Veh, Pert)
# group by ID and pick the first value of each column
result_df <-
  left_joined_df %>% group_by(ID) %>% summarize(
    .groups = 'keep',
    Per = first(Per),
    Gu = first(Gu),
    Ta = first(Ta),
    Veh = first(Veh),
    Pert = first(Pert),
    Ti = first(Ti),
    ID1 = first(ID1)
  ) %>% ungroup()
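
The summarize() step above calls first() once per column. A shorter variant, sketched here on the assumption that keeping the first row per ID after sorting is all that is needed, uses distinct() with .keep_all = TRUE. The data frames are rebuilt with data.frame() only to keep the snippet self-contained, and ID is added to arrange() so the output comes back in ID order.

```r
library(dplyr)

# Same values as dat1 and dat2 in the question
dat1 <- data.frame(
  ID  = c(1123, 1124, 1125, 1126),
  Per = c(112301, 112401, 112501, 112601),
  Gu  = c(14, 14, 29, 22),
  Ta  = c(13, 19, 25, 20)
)
dat2 <- data.frame(
  ID   = c(1123, 1123, 1124, 1124, 1124, 1125, 1125, 1125, 1126, 1126),
  Veh  = c(1, 2, 1, 2, 3, 2, 3, 4, 3, 3),
  Pert = c(1, 1, 1, 1, 2, 2, 3, 1, 4, 1),
  Ti   = c(100, 110, 107, 111, 109, 118, 113, 108, 119, 112),
  ID1  = c(11231, 11232, 11241, 11242, 11243, 11251, 11252, 11253, 11265, 11268)
)

result_df <- dat1 %>%
  left_join(dat2, by = "ID") %>%
  # Sort so that, within each ID, the lowest Veh (then lowest Pert) comes first
  arrange(ID, Veh, Pert) %>%
  # Keep only the first row seen for each ID, with all of its columns
  distinct(ID, .keep_all = TRUE)

result_df
```

Because distinct() keeps whole rows, this does not need to be updated when dat2 gains or loses columns.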

Answer 2

Score: 1

dat1 |>
  left_join(dat2 |>
              arrange(Veh, Pert) |>
              slice(1, .by = ID))
Joining with `by = join_by(ID)`
    ID    Per Gu Ta Veh Pert  Ti   ID1
1 1123 112301 14 13   1    1 100 11231
2 1124 112401 14 19   1    1 107 11241
3 1125 112501 29 25   2    2 118 11251
4 1126 112601 22 20   3    1 112 11268
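
The approach above reduces dat2 to one row per ID before joining. The sketch below shows the same idea with group_by() and slice_head() for dplyr versions before 1.1.0, where the .by argument is not available, and passes by = "ID" explicitly so no "Joining with" message is printed. The extra ID 1127 and its Per, Gu and Ta values are hypothetical; they are added only to show that a left join keeps dat1 rows that have no match in dat2.

```r
library(dplyr)

dat1 <- data.frame(
  ID  = c(1123, 1124, 1125, 1126, 1127),  # 1127 is hypothetical: no match in dat2
  Per = c(112301, 112401, 112501, 112601, 112701),
  Gu  = c(14, 14, 29, 22, 18),
  Ta  = c(13, 19, 25, 20, 16)
)
dat2 <- data.frame(
  ID   = c(1123, 1123, 1124, 1124, 1124, 1125, 1125, 1125, 1126, 1126),
  Veh  = c(1, 2, 1, 2, 3, 2, 3, 4, 3, 3),
  Pert = c(1, 1, 1, 1, 2, 2, 3, 1, 4, 1),
  Ti   = c(100, 110, 107, 111, 109, 118, 113, 108, 119, 112),
  ID1  = c(11231, 11232, 11241, 11242, 11243, 11251, 11252, 11253, 11265, 11268)
)

# One row per ID: lowest Veh first, ties broken by lowest Pert
best_dat2 <- dat2 %>%
  arrange(Veh, Pert) %>%
  group_by(ID) %>%
  slice_head(n = 1) %>%
  ungroup()

# Every dat1 row is kept; unmatched IDs get NA in the dat2 columns
result <- left_join(dat1, best_dat2, by = "ID")
result
```

Reducing dat2 first also means the join is one-to-one, so the result has exactly as many rows as dat1.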

Answer 3

Score: 1

Easily done.

library(tidyverse)
dat_join <- left_join(dat1, dat2, by = "ID") %>%
  arrange(Veh) %>%
  arrange(Pert)
dat_join
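
Note that with two chained arrange() calls the last one sets the primary sort key, so the rows end up ordered by Pert first, and the joined data still holds every matching dat2 row. The sketch below is a variant rather than the answer's own code: it compares the chained form with a single arrange(Veh, Pert) and keeps the first row per ID in both cases. The names chained and combined are introduced here for illustration.

```r
library(dplyr)

dat1 <- data.frame(
  ID  = c(1123, 1124, 1125, 1126),
  Per = c(112301, 112401, 112501, 112601),
  Gu  = c(14, 14, 29, 22),
  Ta  = c(13, 19, 25, 20)
)
dat2 <- data.frame(
  ID   = c(1123, 1123, 1124, 1124, 1124, 1125, 1125, 1125, 1126, 1126),
  Veh  = c(1, 2, 1, 2, 3, 2, 3, 4, 3, 3),
  Pert = c(1, 1, 1, 1, 2, 2, 3, 1, 4, 1),
  Ti   = c(100, 110, 107, 111, 109, 118, 113, 108, 119, 112),
  ID1  = c(11231, 11232, 11241, 11242, 11243, 11251, 11252, 11253, 11265, 11268)
)

joined <- left_join(dat1, dat2, by = "ID")

# Chained calls: the final arrange(Pert) makes Pert the primary key
chained <- joined %>%
  arrange(Veh) %>%
  arrange(Pert) %>%
  group_by(ID) %>%
  filter(row_number() == 1) %>%
  ungroup()

# Single call: Veh is the primary key, Pert only breaks ties
combined <- joined %>%
  arrange(Veh, Pert) %>%
  group_by(ID) %>%
  filter(row_number() == 1) %>%
  ungroup()

# For ID 1125 the two orderings pick different rows
chained$ID1[chained$ID == 1125]    # 11253 (Pert = 1, Veh = 4)
combined$ID1[combined$ID == 1125]  # 11251 (Veh = 2, Pert = 2)
```

Only the combined form matches the expected output in the question, where the minimum Veh takes priority over the minimum Pert.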
