February 16, 2023, 07:46:14

Count how many times each string from a column appears (no exact match) in another column in R

Question

My data looks like this:

df <- data.frame(id = c("p3", "p5", "p8", "p9", "p10", "p11"), pedi = c("p1/p2", "p3/p4", "p3/p5", "(p3/p4)/p5", "p5/p8", "p4/p10"))

I am trying this:

id <- df$id
for (i in length(id)) {
  df$id_in_pedi <- sum(grepl(i, df$pedi))
}

But it does not work. The result I am looking for is this:

df <- data.frame(id = c("p3", "p5", "p8", "p9", "p10", "p11"),
                 pedi = c("p1/p2", "p3/p4", "p3/p5", "(p3/p4)/p5", "p5/p8", "p4/p10"),
                 id_in_pedi = c(3, 3, 1, 0, 1, 0))

Thanks

Answer 1

Score: 3

In tidyverse:

library(tidyverse)
df %>%
  mutate(id_in_pedi = str_count(toString(pedi), id))

   id       pedi id_in_pedi
1  p3      p1/p2          3
2  p5      p3/p4          3
3  p8      p3/p5          1
4  p9 (p3/p4)/p5          0
5 p10      p5/p8          1
6 p11     p4/p10          0

In base R, using sapply:

transform(df, id_in_pedi = colSums(sapply(id, grepl, pedi, USE.NAMES = FALSE)))

   id       pedi id_in_pedi
1  p3      p1/p2          3
2  p5      p3/p4          3
3  p8      p3/p5          1
4  p9 (p3/p4)/p5          0
5 p10      p5/p8          1
6 p11     p4/p10          0
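
The two approaches above agree on this data, but they measure slightly different things: `str_count` tallies every occurrence of the id in the combined string, while the `grepl` version counts the rows that contain the id at least once. The sketch below illustrates the difference in base R only; `pedi_extra` and `target` are hypothetical values made up for the illustration and are not part of the original data.

```r
# Hypothetical pedigrees in which "p3" occurs twice in one string;
# these values are NOT from the original question.
pedi_extra <- c("p3/p3", "p3/p4")
target <- "p3"

# Row-based count: how many pedigree strings contain the id at least once.
rows_with_id <- sum(grepl(target, pedi_extra, fixed = TRUE))

# Occurrence-based count: total number of matches across all strings.
matches <- gregexpr(target, pedi_extra, fixed = TRUE)
occurrences <- sum(lengths(regmatches(pedi_extra, matches)))

rows_with_id  # 2
occurrences   # 3
```

If an id can never appear twice in the same pedigree, both counts coincide and either approach should do.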

Using Vectorize:

colSums(Vectorize(grepl)(df$id, list(df$pedi)))
 p3  p5  p8  p9 p10 p11 
  3   3   1   0   1   0
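
For comparison with the attempt in the question, here is a minimal sketch of a corrected loop. The original `for (i in length(id))` runs only once, with `i` equal to 6, so `grepl` searches for the text "6", and the assignment overwrites the whole column with that single sum. The version below iterates over positions and fills one row at a time; using `fixed = TRUE` (literal matching rather than a regular expression) is an assumption about what is wanted, not something stated in the question.

```r
df <- data.frame(
  id   = c("p3", "p5", "p8", "p9", "p10", "p11"),
  pedi = c("p1/p2", "p3/p4", "p3/p5", "(p3/p4)/p5", "p5/p8", "p4/p10")
)

# Pre-allocate the result column, then fill it row by row.
df$id_in_pedi <- NA_integer_
for (i in seq_along(df$id)) {
  # Count the pedigree strings that contain the i-th id as a literal substring.
  df$id_in_pedi[i] <- sum(grepl(df$id[i], df$pedi, fixed = TRUE))
}

df$id_in_pedi  # 3 3 1 0 1 0
```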

Answer 2

Score: 0

Using base R:

table(factor(unlist(strsplit(df$pedi, "[/()]")), levels = df$id))

Output:

  p3  p5  p8  p9 p10 p11 
  3   3   1   0   1   0
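
One difference worth noting: this answer compares whole tokens, whereas the `grepl` and `str_count` approaches match substrings. On the data in the question the results are the same, but they could diverge when one id is a prefix of another. The sketch below uses a hypothetical id `"p1"` (not in the original `id` column) to show the gap.

```r
df <- data.frame(
  id   = c("p3", "p5", "p8", "p9", "p10", "p11"),
  pedi = c("p1/p2", "p3/p4", "p3/p5", "(p3/p4)/p5", "p5/p8", "p4/p10")
)

# Hypothetical id, chosen because it is a prefix of "p10"; NOT in the original data.
probe <- "p1"

# Substring matching: counts rows whose pedigree contains "p1" anywhere.
substring_count <- sum(grepl(probe, df$pedi, fixed = TRUE))

# Token matching: split on "/", "(" and ")" and compare whole tokens.
tokens <- unlist(strsplit(df$pedi, "[/()]"))
token_count <- sum(tokens == probe)

substring_count  # 2 ("p1/p2" and "p4/p10")
token_count      # 1 ("p1/p2" only)
```

Which behaviour is right depends on whether "no exact match" in the title means matching inside a longer pedigree string or matching inside longer ids as well.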
