July 27, 2023 19:35:09

Create count table for specific condition and then add column that creates count by group as a whole in R

Question

I have a table like this:

data1 <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"), "Filter" = c("Filter", "Filter", 
"No Filter", "Filter", "Filter"), "Threshold" = c("Exceeds","Exceeds", NA, "NL", "Exceeds"))

I'd like to create a count table that groups by State and Filter and then counts the number of times the Threshold is "Exceeds" and put that value in a new column. And then count the number of times a certain State and Filter combination occurs and put it in another column.
An example of what I'm looking for is below.

final_data <- data.frame("State" = c("NJ", "NJ", "PA", "NJ", "TX"), "Filter" = c("Filter", 
"Filter", "No Filter", "Filter", "Filter"), "Threshold" = c("Exceeds", "Exceeds", NA, "NL", 
"Exceeds"), Count_Exceeds_Threshold = c(2, 2, 0, 0, 1), Count_Total = c(3, 3, 1, 3, 1))

I've tried figuring this out with group_by and tally() in dplyr, but I can't get it to work the way I want.

Thank you!!

Answer 1

Score: 2

You can use add_count() for both purposes:

library(dplyr)
data1 %>%
  group_by(State, Filter) %>%
  add_count(wt = (Threshold == "Exceeds"), name = "Count_Exceeds_Threshold") %>%
  add_count(name = "Count_Total") %>%
  ungroup()
# # A tibble: 5 × 5
#   State Filter    Threshold Count_Exceeds_Threshold Count_Total
#   <chr> <chr>     <chr>                       <int>       <int>
# 1 NJ    Filter    Exceeds                         2           3
# 2 NJ    Filter    Exceeds                         2           3
# 3 PA    No Filter NA                              0           1
# 4 NJ    Filter    NL                              2           3
# 5 TX    Filter    Exceeds                         1           1

Answer 2

Score: 1

You can use mutate() with .by for inline grouping (the .by argument is available in dplyr 1.1.0 and later) and sum Threshold == "Exceeds" to count how often that value occurs in each group. n() gives the number of rows in each group.

library(dplyr)
data1 %>%
  mutate(Count_Exceeds_Threshold = sum(Threshold == "Exceeds", na.rm = TRUE),
         Count_Total = n(), .by = c(State, Filter))
#   State    Filter Threshold Count_Exceeds_Threshold Count_Total
# 1    NJ    Filter   Exceeds                       2           3
# 2    NJ    Filter   Exceeds                       2           3
# 3    PA No Filter      <NA>                       0           1
# 4    NJ    Filter        NL                       2           3
# 5    TX    Filter   Exceeds                       1           1
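
If dplyr is not available, the same per-group counts can be sketched in base R with ave(), which applies a function within each group and returns a vector aligned to the original rows. This is not taken from either answer; it is a minimal sketch that assumes the data1 layout from the question, and the helper name is_exceeds is only illustrative.

```r
# Rebuild the example data from the question
data1 <- data.frame(
  State = c("NJ", "NJ", "PA", "NJ", "TX"),
  Filter = c("Filter", "Filter", "No Filter", "Filter", "Filter"),
  Threshold = c("Exceeds", "Exceeds", NA, "NL", "Exceeds")
)

# TRUE where Threshold is "Exceeds"; %in% returns FALSE for NA
# (is_exceeds is an illustrative name, not from the original answers)
is_exceeds <- data1$Threshold %in% "Exceeds"

# Sum the indicator within each State/Filter group, one value per row
data1$Count_Exceeds_Threshold <- ave(as.integer(is_exceeds),
                                     data1$State, data1$Filter, FUN = sum)

# Sum a column of ones within each group to get the group size
data1$Count_Total <- ave(rep(1L, nrow(data1)),
                         data1$State, data1$Filter, FUN = sum)

print(data1)
```

Using %in% rather than == sidesteps the NA in Threshold, so no na.rm argument is needed.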
